Chimeric Transcription Factors

Related Applications 

This application is a continuation-in-part of USSN 09/262,721, filed March 4, 1999,
which is a continuation-in-part of USSN 09/096,732, filed June 11, 1998, which is a
continuation-in-part of USSN 08/672,213, filed June 27, 1996, which was a continuation of
USSN 60/000,553, filed June 27, 1995, and USSN 60/019,614, filed December 29, 1995, the
full contents of which are incorporated herein by reference.

Introduction

A variety of applications involving gene transcription, including among others, gene 
therapy, production of biological materials and biological research, depend on the ability to 
elicit specific and high-level expression of genes encoding RNAs or proteins of therapeutic, 
commercial or experimental value. Achieving a sufficiently high level of expression for clinical
or other utility in genetically engineered cells within whole organisms has often been a limiting
problem. Various approaches for addressing this problem, including the search for stronger 
transcriptional promoters or higher transfection efficiencies, have in many cases not met with 
success. Meanwhile, in various lines of research with transcription factors, promising results 
in transient transfection models have not been borne out with chromosomally integrated
reporter gene constructs. Furthermore, overexpression of transcription factors is commonly
associated with toxicity to the host cell. Despite those precedents, this invention takes a 
novel approach to the challenge of optimizing heterologous gene expression through new 
uses of, and new designs for, transcription factor proteins which are expressed within the 
engineered cells containing the target gene. The invention provides improved methods and
materials for achieving high-level expression of a target gene in genetically engineered cells,
including genetically engineered cells within whole organisms. 

Summary of the invention 

This invention involves improvements in transcription activation domains and their 
use in fusion proteins referred to as "chimeric transcription factors". The invention further
involves DNA sequences encoding those chimeric transcription factors, transcription control 
sequences responsive to the chimeric transcription factors, target gene constructs containing 
a target gene operably linked to such a transcription control sequence, genetically engineered 
cells which contain a target gene construct and a construct for expressing a chimeric 
transcription factor of the invention, organisms containing such cells, methods for producing
such cells and organisms, and methods for using the foregoing in gene therapy, production of 
biological materials and biological research. 

Of particular interest are recombinant nucleic acids (typically DNA, although 
embodiments involving RNA are within the scope of the invention) encoding a chimeric 
transcription factor which (a) contains at least two mutually heterologous domains comprising
a p65 domain and a ligand binding domain, and (b) is capable of activating the transcription
of an appropriate target gene construct, discussed below, in a ligand-dependent manner.
The p65 domain comprises part or all of the peptide sequence spanning positions 281-550,
321-550, 361-550, 361-450 or 450-550 of NF-kB p65 (preferably
human), or a peptide sequence derived therefrom. A "p65 domain" as that term is used herein
denotes a peptide sequence from about 40 amino acids up to several hundred amino acids in
length, as discussed in further detail and illustrated below. To avoid confusion, peptide
sequences derived from p65 which contain fewer than 40 amino acids are referred to as "p65
motifs". In addition to the p65 and ligand binding domains, the chimeric transcription factors
may further contain one or more additional, optional domains, including for instance, one or
more transcription potentiating domains and/or a nuclear localization sequence. 
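
The length convention just described (a "p65 domain" being about 40 or more residues, a "p65 motif" fewer than 40) and the 1-based, inclusive residue ranges can be sketched as follows. This is a minimal illustration only: the function names, the range table and the placeholder sequence are assumptions introduced here, and an actual p65 sequence would have to be supplied by the practitioner.

```python
# Illustrative sketch; all names and the placeholder sequence are hypothetical.

# Residue ranges of NF-kB p65 named in the text (1-based, inclusive).
P65_RANGES = [(281, 550), (321, 550), (361, 550), (361, 450), (450, 550)]

# "About 40 amino acids" per the text; treated here as a hard cutoff for simplicity.
DOMAIN_MIN_LENGTH = 40


def excise(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("range falls outside the sequence")
    return sequence[start - 1:end]


def classify(fragment: str) -> str:
    """Label a p65-derived fragment by length, following the text's convention."""
    return "p65 domain" if len(fragment) >= DOMAIN_MIN_LENGTH else "p65 motif"


# Placeholder string standing in for a real p65 sequence (NOT actual sequence data).
placeholder_p65 = "A" * 551

# Length of the fragment excised for each named range.
fragment_lengths = {
    (start, end): len(excise(placeholder_p65, start, end))
    for start, end in P65_RANGES
}
```

Because the ranges are inclusive at both ends, the fragment for positions 361-450 contains 90 residues rather than 89.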

Preferred chimeric transcription factors stimulate transcription of a target gene in a
ligand-dependent manner as disclosed in greater detail below. Preferably the difference in
level of transcription observed in the presence and absence of ligand, respectively, is at
least two, more preferably three, and even more preferably four or more orders of magnitude.
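
The "orders of magnitude" criterion can be made concrete with a short calculation. The sketch below is illustrative only: it assumes that transcription is read out as a positive reporter signal (for example, reporter activity in arbitrary units) measured with and without ligand, and the function names and tier labels are introduced here for illustration.

```python
import math


def fold_induction(with_ligand: float, without_ligand: float) -> float:
    """Ratio of the signal with ligand to the signal without ligand."""
    if with_ligand <= 0 or without_ligand <= 0:
        raise ValueError("both signals must be positive")
    return with_ligand / without_ligand


def orders_of_magnitude(with_ligand: float, without_ligand: float) -> float:
    """Base-10 logarithm of the fold induction."""
    return math.log10(fold_induction(with_ligand, without_ligand))


def preference_tier(with_ligand: float, without_ligand: float) -> str:
    """Map the measured difference onto the tiers stated in the text."""
    orders = orders_of_magnitude(with_ligand, without_ligand)
    if orders >= 4:
        return "four or more orders (even more preferred)"
    if orders >= 3:
        return "three orders (more preferred)"
    if orders >= 2:
        return "two orders (preferred)"
    return "below two orders"
```

For instance, a signal of 1500 units with ligand against 1 unit without ligand is a 1500-fold induction, a little over three orders of magnitude.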

In various embodiments the chimeric transcription factor contains one or more copies
of one or more p65 domains, optionally together with one or more copies of one or more
different transcription activation domains, subdomains or potentiating motifs derived, for
example, from p65 or other transcription activation domains (collectively, "transcription
potentiating domains"). Transcription activation domains comprising a non-naturally occurring
peptide sequence containing either two or more heterologous activation domains, one
activation domain and two or more copies of a reiterated peptide sequence, or two or more
copies of a reiterated peptide sequence constitute "composite transcription activation
domains". One illustrative class of composite transcription activation domains comprises
domains containing (a) two or more p65 domains (which may be the same or different), (b)
one or more p65 domains plus one or more p65 motifs, or (c) one or more p65 domains
together with one or more copies of one or more transcription activation potentiating domains.

Transcription potentiating domains are peptide sequences which can be shown to
potentiate the transcription activation potency of a transcription factor (relative to the
corresponding chimeric transcription factor lacking that potentiating domain). Illustrative
potentiating domains comprise motifs which may be selected or derived from the so-called
"proline-rich", "glutamine-rich" and "acidic" activation motifs such as the VP16 V8 motif
(DFDLDMLG), the related "V9" motif (DFDLDMLGG), a human activation motif such as the
14 amino acid acidic motif of human heat shock factor (HSF) or an alanine/proline-rich motif
selected from p53 or CTF (preferably human) including such motifs which are homologous to
alanine/proline-rich motifs within residues 361-450 of p65. Alternatively, an activation
domain such as amino acids 431-529 of HSF may be fused to p65 to form a composite
activation domain. An exemplary composite activation domain (referred to as S3H)
comprises the peptide sequence p65(281-551) linked to the peptide sequence HSF(406-530).

Transcription factors of this invention may be operatively linked to any promoter
selected by the practitioner, including strong promoters such as CMV or weaker promoters
like RSV, the MCK enhancer, or promoters for endogenous human proteins.

A wide variety of ligand binding domains may be used in this invention, although 
ligand binding domains which bind to a cell permeant ligand are generally preferred. It is also 
preferred that the ligand have a molecular weight under about 5 kD, more preferably below
2.5 kD and optimally below about 1500 D. Non-proteinaceous ligands are also preferred.
Ligand binding domains include, for example, domains selected or derived from (a) an
immunophilin (e.g. FKBP12), cyclophilin or FRAP domain; (b) a hormone receptor such as a
receptor for progesterone, ecdysone or another steroid; and (c) an antibiotic receptor such as 
a tetR domain for binding to tetracycline, doxycycline or other analogs or mimics thereof. 

A tetR domain useful in the practice of this invention may comprise a naturally
occurring peptide sequence of a tetR of any of the various classes (e.g. class A, B, C, D or
E) (in which case the absence of the ligand stimulates target gene transcription), or more
preferably, comprises a mutated tetR which comprises at least one amino acid substitution,
addition or deletion compared to a wild-type tetR, especially those mutated tetR domains in
which the presence of the ligand stimulates target gene transcription in a cell engineered in
accordance with this invention. For example, mutated tetR domains include mutated
Tn10-derived tetR domains having an amino acid substitution at one or more of amino acid
positions 71, 95, 101 and 102. By way of further illustration, one mutated tetR comprises
amino acids 1-207 of the Tn10 tetR in which glutamic acid 71 is changed to lysine, aspartic
acid 95 is changed to asparagine, leucine 101 is changed to serine and glycine 102 is
changed to aspartic acid. Ligands include tetracycline and a wide variety of analogs and
mimics of tetracycline, including for example, anhydrotetracycline and doxycycline. Target
gene constructs in these embodiments contain a target gene operably linked to a transcription
control sequence including one or more copies of a DNA sequence recognized by the tetR of
interest, including for example, an upstream activator sequence for the appropriate tet
operator. See e.g. US Patent No. 5,654,168, the full contents of which are expressly
incorporated by reference.
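
The four substitutions named for the Tn10-derived tetR (glutamic acid 71 to lysine, aspartic acid 95 to asparagine, leucine 101 to serine, glycine 102 to aspartic acid) can be written in one-letter code and applied programmatically. The following is a sketch under stated assumptions: the helper names are hypothetical, and the 207-residue input is a toy placeholder built only so that the expected wild-type residues sit at the named positions. It is not the real tetR sequence.

```python
# Illustrative sketch; helper names and the placeholder sequence are hypothetical.

# (wild-type residue, 1-based position, replacement residue), one-letter codes.
SUBSTITUTIONS = [("E", 71, "K"), ("D", 95, "N"), ("L", 101, "S"), ("G", 102, "D")]


def apply_substitutions(sequence: str, substitutions) -> str:
    """Apply point substitutions, checking the wild-type residue at each position first."""
    residues = list(sequence)
    for wild_type, position, replacement in substitutions:
        found = residues[position - 1]
        if found != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, found {found}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


def make_placeholder(length: int = 207) -> str:
    """Build a toy sequence carrying the expected wild-type residues (NOT real tetR)."""
    residues = ["A"] * length
    for wild_type, position, _ in SUBSTITUTIONS:
        residues[position - 1] = wild_type
    return "".join(residues)


placeholder = make_placeholder()
mutated = apply_substitutions(placeholder, SUBSTITUTIONS)
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are easy to make when a sequence file does or does not include the initiator methionine.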

A wide variety of DNA binding domains may be used in the practice of this invention,
including a domain selected or derived from a GAL4, lexA or composite (e.g. ZFHD1) DNA
binding domain, or a DNA binding domain, e.g., in combination with ligand binding domains
such as a wild-type or mutated progesterone receptor domain. TetR domains are discussed in the
context of ligand binding domains. In many applications it is preferable to use a DNA binding
domain which is heterologous to the cells to be engineered. Heterologous DNA binding
domains include those which occur naturally in cell types other than the cells to be
engineered as well as composite DNA binding domains containing component portions which
are not found in the same continuous polypeptide or gene in nature, at least not in the same
order or orientation or with the same spacing present in the composite domain. In the case of
composite DNA binding domains, component peptide portions which are endogenous to the
cells or organism to be engineered are generally preferred.

In the case of the chimeric transcription factors containing a tetR domain, the DNA 
binding domain is typically provided by the tetR component, and is by its nature
heterologous to eukaryotic cells. TetR domains are discussed in further detail in the context of
ligand binding domains. 

In embodiments in which an endogenous gene is to be regulatably expressed, a 
composite DNA binding domain which is selected for recognition of one or more sequences 
upstream of the target gene may be deployed. 

Additional information concerning DNA binding domains is provided below.

Typically it will be preferred that, to the extent available, each of the various domains 
incorporated into the design of the chimeric transcription factor contain a peptide sequence 
which is or is derived from a peptide sequence which naturally occurs in the cells or organism 
in which the chimeric transcription factor is to be expressed. Thus, human sequences or
sequences derived therefrom are preferred for chimeric transcription factors for use in humans.
In some cases, such as the ecdysone or tetR cases, the inclusion of non-human sequence 
may be unavoidable. The encoded chimeric transcription factor may further contain one or 
more additional domains such as a transcription potentiating domain. 

Specific examples of such chimeric transcription factors include fusion proteins
containing at least:

(a) a p65 domain and a ligand binding domain containing a peptide sequence
selected from within, or derived from, an FKBP, FRB or cyclophilin domain (these will
typically be paired with a second fusion protein comprising a DNA binding domain
and at least one ligand binding domain which, in the presence of a divalent ligand,
forms a ligand-dependent (cross-linked) complex with the p65 fusion protein and
activates transcription of a target gene operably linked to a transcription control
sequence containing one or more recognition sequences for the DNA binding domain);



(b) a p65 domain, a DNA binding domain (e.g. GAL4 or ZFHD1) and a ligand binding
domain containing a peptide sequence selected from within, or derived from, a hormone
receptor such as a progesterone receptor domain (see e.g. WO 93/23431 and WO
98/18925) or ecdysone receptor (see e.g., WO 97/38117 and WO 96/37609); and,

(c) a p65 domain and a tetR domain, or domain derived therefrom, which binds to a
characteristic DNA sequence in a ligand-dependent manner (see e.g., US Patent Nos.
5,650,298 and 5,654,168).

The chimeric transcription factors may further include one or more optional domains, 
including for example, one or more transcription potentiating domains, and the p65 domain
may in fact be replaced with a p65-containing composite transcription activation domain as
described above. 

The recombinant nucleic acid encoding the chimeric transcription factor may be 
operably linked to a transcription control sequence permitting expression of the chimeric 
transcription factor in cells. Such recombinant nucleic acid constructs may be contained within
any of a variety of DNA vectors for use in transfecting prokaryotic or eukaryotic cells. A target
gene construct may be included in the same vector or may be provided in an additional 
vector. The recombinant nucleic acid encoding the chimeric transcription factor and optionally 
a target gene construct may be present within one or more recombinant viruses for delivery 
(by infection) to cells in vitro or in vivo (i.e., by administration of recombinant virus to the
whole organism). Conventional techniques may be used to prepare recombinant viruses
harboring the recombinant nucleic acids of this invention. Adenoviruses, adeno-associated 
viruses, hybrid adeno-AAV, retroviruses and lentiviruses are of particular interest at present. 

Compositions containing a recombinant nucleic acid encoding a chimeric transcription 
factor together with a target gene construct may be included in a kit or package for delivery to
researchers, hospitals, physicians or veterinarians. In some cases a "universal" target gene
construct may be included in which the target gene is replaced with a cloning site for insertion 
by the practitioner of any desired coding sequence. Such compositions or kits which are 
designed for regulated expression may further include a sample of ligand for activating target 
gene transcription. It should also be noted that the various nucleic acids may be present in
vectors, recombinant viruses, etc. as described elsewhere.

A recombinant nucleic acid encoding a chimeric transcription factor of this invention
may be used to transduce a cell to render it capable of expressing a target gene in a
ligand-dependent manner. A chimeric transcription factor is chosen which is capable of stimulating,
in a ligand-dependent manner, the transcription of a target gene operably linked to a
transcription control sequence recognized by the chimeric transcription factor. A target gene
construct comprising a desired target gene operably linked to a transcription control sequence
which is recognized by the chimeric transcription factor may be transduced into the cell as
well, and may be included in the same vector or recombinant virus as the recombinant nucleic
acid encoding the chimeric transcription factor, or in a different one.

In certain applications, cells are so transduced in vitro. In other applications cells are 
transduced while present within an organism, generally a human or non-human mammal. 

Cells containing a recombinant nucleic acid encoding a chimeric transcription factor of
this invention are useful in a variety of applications as mentioned above, especially cells
which further comprise a target gene operably linked to a transcription control sequence
which is responsive to the chimeric transcription factor in the presence of a ligand. In order to
stimulate transcription of the target gene in such cells, one exposes the cells to a ligand which
binds to the chimeric transcription factor. This may be conveniently effected by simply
adding the ligand to the culture medium, in an effective amount to yield the desired level of
transcription.

Examples of such cells include the following: 

(1) A cell containing (a) a recombinant nucleic acid encoding a chimeric transcription factor
which comprises a p65 domain, a DNA binding domain and a ligand binding domain
comprising or derived from a progesterone receptor domain, and (b) a target gene construct
which comprises a target gene operably linked to a transcription control sequence which
contains one or more copies of a DNA sequence recognized by the DNA binding domain of
the chimeric transcription factor, the cell being capable of expressing its target gene in a
ligand-dependent manner, the ligand being progesterone or an analog or mimic thereof.

(2) A cell containing (a) a recombinant nucleic acid encoding a chimeric transcription factor 
which comprises a p65 domain and a tetR domain which binds to a recognized DNA 
sequence in the presence of its ligand, and (b) a target gene construct which comprises a
target gene operably linked to a transcription control sequence which contains one or more
copies of a DNA sequence recognized by the tetR domain of the chimeric transcription factor, 
the cell being capable of expressing its target gene in a ligand-dependent manner, the ligand 
being tetracycline, doxycycline or an analog or mimic thereof. 

(3) A cell containing (a) a recombinant nucleic acid encoding a chimeric transcription factor
which comprises a p65 domain and an ecdysone receptor domain capable of binding to a
DNA binding protein comprising or derived from the peptide sequence of an RXR protein, and
(b) a target gene construct which comprises a target gene operably linked to a transcription
control sequence which contains one or more copies of a DNA sequence recognized by the 
RXR, the cell being capable of expressing its target gene in a ligand-dependent manner, the 
ligand being ecdysone or an analog or mimic thereof. 

Cells so engineered to express a chimeric transcription factor of this invention and a 
corresponding target gene construct responsive to such factor in a ligand-dependent manner 
may be introduced into the host organism, thereby rendering the organism capable of 
regulated expression of a target gene. A non-human organism containing one or more such
cells can be used in research to study the effect of regulated expression of a target gene of
possible interest. Such animals may also be used as model systems for the study of various 
diseases and for the evaluation of drug candidates for treating such diseases. 

Alternatively, the various recombinant nucleic acids may be introduced directly into 
the organism to transduce cells in vivo and render a host organism capable of regulated
expression of a target gene. Of the various methods for gene delivery, use of one or more
recombinant viruses containing the recombinant nucleic acids is currently preferred. 
Particularly important applications of this methodology involve the use of human subjects as 
the host organism. To stimulate transcription of a target gene in an organism containing cells 
transduced with the appropriate recombinant nucleic acids, one administers to the organism,
by any acceptable means of administration, a ligand which binds to the chimeric transcription
factor expressed in the cells, in an amount effective to yield the desired level of gene 
transcription. The ligand may be in the form of a pharmaceutically or veterinarily acceptable 
composition, delivered by any pharmaceutically or veterinarily acceptable route of 
administration. 

Brief Description of the Figures 

Figure 1 demonstrates that in vivo administration of a ligand to animals into which engineered
cells had been transplanted led to regulated gene expression and the production and
secretion of the gene product. HT1080 cells were transfected with DNA constructs encoding
regulatable transcription factor components as described in the examples below. Transfected
HT1080 cells (2 x 10^ total per animal, in four different sites) were injected intramuscularly
into male nu/nu mice. Approximately one hour later, animals received the indicated
concentration of intravenous rapamycin. Blood samples were collected 17 hours after
rapamycin administration and assayed for hGH concentration. Rapamycin treatment
produced a dose-dependent increase in serum hGH (mean ± SEM; n = at least 5 at each dose). *
represents statistical significance from each lower rapamycin dose and † represents statistical
significance from rapamycin doses which are 10-fold or more lower (p < 0.05, one-way
analysis of variance and Tukey-Kramer multiple comparison testing).

Figs. 2 through 7 present comparative data on a representative collection of chimeric
transcription factors assayed in cell lines into which target gene constructs (SEAP) had been
stably integrated as described in the examples which follow below.

Figure 8 shows the potency of the transcription factor S3H (p65(281-551)-HSF(406-530)) in
an in vitro transcription assay.

Detailed Description of the Invention

Definitions

For convenience, the intended meanings of certain terms and phrases used herein are
provided below. 

"Activate" as applied to the expression or transcription of a gene denotes a directly 
or indirectly observable increase in the production of a gene product, e.g., an RNA or
polypeptide encoded by the gene.

"Capable of selectively hybridizing" as that phrase is used herein means that 
two DNA molecules are susceptible to hybridization with one another, despite the presence 
of other DNA molecules, under hybridization conditions which can be chosen or readily 
determined empirically by the practitioner of ordinary skill in this art. Such conditions include
conditions of high stringency such as washing extensively with buffers containing 0.2 to 6 x
SSC, and/or containing 0.1% to 1% SDS, at temperatures ranging from room temperature to 
65-75°C. See for example F.M. Ausubel et al., Eds, Short Protocols in Molecular Biology, 
Units 6.3 and 6.4 (John Wiley and Sons, New York, 3d Edition, 1995). 

"Cells", "host cells" and "genetically engineered cells" refer not only to the 

30 particular subject cells but to the progeny or potential progeny of such a cell. Because certain 
modifications may occur in succeeding generations due to either mutation or environmental 
influences, such progeny may not, in fact, be identical to the parent cell, but are still included 
within the scope of the term as used herein. 

"Cell line" refers to a population of cells capable of continuous or prolonged growth 

35 and division in vitro. Often, cell lines are clonal populations derived from a single progenitor 
cell. It is further known in the art that spontaneous or induced changes can occur in karyotype 
during storage or transfer of such clonal populations. Therefore, cells derived from the cell line
referred to may not be precisely identical to the ancestral cells or cultures, and the cell line
referred to includes such variants.

"Composite", "fusion", "chimeric" and "recombinant" denote a nnaterial such 
as a nucleic acid, nucleic acid sequence or polypeptide which contains at least two
constituent portions which are mutually heterologous in the sense that they are not otherwise
found directly (covalently) linked in nature, e.g. are not found in the same continuous
polypeptide or gene in nature, at least not in the same order or orientation or with the same
spacing present in the composite, fusion or recombinant product. Such materials contain
components derived from at least two different proteins or genes or from at least two
non-adjacent portions of the same protein or gene. In general, "composite" refers to portions of
different proteins or nucleic acids which are joined together to form a single functional unit,
while "fusion" generally refers to two or more functional units which are linked together.
"Recombinant" is generally used in the context of nucleic acids or nucleic acid sequences.

A "coding sequence" or a sequence which "encodes" a particular polypeptide or
RNA, is a nucleic acid sequence which is transcribed (in the case of DNA) and translated (in
the case of mRNA) into a polypeptide in vitro or in vivo when placed under the control of an
appropriate expression control sequence. The boundaries of the coding sequence are
determined by a start codon at the 5' (amino) terminus and a translation stop codon at the 3'
(carboxy) terminus. A coding sequence can include, but is not limited to, cDNA from
procaryotic or eukaryotic mRNA, genomic DNA sequences from procaryotic or eukaryotic
DNA, and even synthetic DNA sequences. A transcription termination sequence will usually
be located 3' to the coding sequence.

A "construct", e.g., a "nucleic acid construct" or "DNA construct", refers to a nucleic
acid or nucleic acid sequence.

"Derived from" indicates a peptide or nucleotide sequence selected from within a
given sequence. A peptide or nucleotide sequence derived from a named sequence may
contain a small number of modifications relative to the parent sequence, in most cases
representing deletion, replacement or insertion of less than about 15%, preferably less than
about 10%, and in many cases less than about 5%, of amino acid residues or base pairs
present in the parent sequence. In the case of DNAs, one DNA molecule is also considered
to be derived from another if the two are capable of selectively hybridizing to one another.
Typically, a derived peptide sequence will differ from a parent sequence by the replacement
of up to 5 amino acids, in many cases up to 3 amino acids, and very often by 0 or 1 amino
acids. Correspondingly, a derived nucleic acid sequence will differ from a parent sequence
by the replacement of up to 15 bases, in many cases up to 9 bases, and very often by 0 to
3 bases. In some cases the amino acid(s) or base(s) is/are deleted rather than replaced.
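
The percentage thresholds in the definition of "derived from" can be checked mechanically for the simplest case. The sketch below is illustrative only and makes a simplifying assumption not found in the text: it handles replacements between two equal-length sequences and ignores insertions and deletions, which would require a sequence alignment. The function names are introduced here for illustration.

```python
def count_differences(parent: str, candidate: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(parent) != len(candidate):
        raise ValueError("this sketch only compares equal-length sequences")
    return sum(1 for a, b in zip(parent, candidate) if a != b)


def fraction_modified(parent: str, candidate: str) -> float:
    """Fraction of parent positions that are replaced in the candidate."""
    if not parent:
        raise ValueError("parent sequence must not be empty")
    return count_differences(parent, candidate) / len(parent)


def within_threshold(parent: str, candidate: str, max_fraction: float = 0.15) -> bool:
    """True if fewer than max_fraction of positions are replaced.

    The default of 0.15 mirrors "less than about 15%"; pass 0.10 or 0.05
    for the narrower preferences stated in the text.
    """
    return fraction_modified(parent, candidate) < max_fraction
```

With a 20-residue parent, two replacements give a modified fraction of 10%, which satisfies the 15% threshold but not the stricter "less than about 10%" preference.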

"Divalent", as that term is applied to ligands in this document, denotes a ligand
which is capable of complexing with at least two protein molecules which contain ligand
binding domains, to form a three (or greater number)-component complex.

"Domain" refers to a portion of a protein or polypeptide. In the art, domain may refer 
to a discrete 2° structure. However, as will be apparent from the context used herein, the
term "domain" is not intended to be limited to a discrete folding domain. Rather, consideration 
of a polypeptide sequence as a "domain" in, e.g., a fusion protein herein, can be made 
simply by the observation that the polypeptide has a specific activity, function or source. 
Most domains described herein can be derived from proteins ranging from naturally occurring 
proteins to completely artificial sequences.

A "DNA binding domain" refers to a polypeptide which interacts, or binds, with a 
higher affinity to a nucleic acid having a particular nucleotide sequence relative to a nucleic 
acid having a different nucleotide sequence. 

"DNA recognition sequence" means a DNA sequence which is capable of 
binding to one or more DNA-binding domains, e.g., of a transcription factor or an engineered 
polypeptide. 

"Endogenous" refers to molecules which are naturally occurring in a cell, i.e. prior to 
the genetic engineering or infection of the cell. 

"Exogenous" refers to molecules which are not naturally present in the cell, and 
which have been, e.g., introduced by transfection or transduction of the cell (or the parent cell 
thereof). 

"Gene" refers to a nucleic acid molecule or sequence comprising an open reading 
frame and including at least one exon and (optionally) an intron sequence. The term "intron" 
refers to a DNA sequence present in a given gene which is not translated into protein and is
generally found between exons.

"Genetically engineered cells" denotes cells which have been modified by the 
introduction of recombinant or heterologous nucleic acids (e.g. one or more DNA constructs or 
their RNA counterparts) and further includes the progeny of such cells which retain part or all
of such genetic modification.

"Heterologous" as it relates to nucleic acid sequences such as coding sequences 
and control sequences, denotes sequences that are not normally joined together, and/or are 
not normally associated with a particular cell. Thus, a "heterologous" region of a nucleic acid 
construct is a segment of nucleic acid within or attached to another nucleic acid molecule that 
is not found in association with the other molecule in nature. For example, a heterologous 
region of a construct could include a coding sequence flanked by sequences not found in 
association with the coding sequence in nature. Another example of a heterologous coding 
sequence is a construct where the coding sequence itself is not found in nature (e.g.,
synthetic sequences having codons different from the native gene). Similarly, in the case of
a cell transduced with a nucleic acid construct which is not normally present in the cell, the cell
and the construct would be considered mutually heterologous for purposes of this invention.
Allelic variation or naturally occurring mutational events do not give rise to heterologous DNA,
as used herein.

"Interact" as used herein is meant to include detectable interactions between 
molecules, such as can be detected using, for example, a yeast two hybrid assay or by 
immunoprecipitation. The term interact is also meant to include "binding" interactions between 
molecules. Interactions may be, for example, protein-protein, protein-nucleic acid,
protein-small molecule or small molecule-nucleic acid in nature.

"Ligand" refers to any molecule which is capable of interacting with a corresponding 
protein or protein domain. A ligand can be naturally occurring, or the ligand can be partially or 
wholly synthetic. The term "modified ligand" refers to a ligand which has been modified such 
that it does not significantly interact with the naturally occurring receptor of the ligand in its
non-modified form. Ligands may be formulated and administered to cells or human or non-human
animals as disclosed in the various patent documents cited herein. 

A "ligand binding domain" is a domain which binds to a ligand or analogs or mimics 
thereof with measurable preference to binding to other materials. For the purpose of this 
document, DNA is not a ligand, and a DNA binding domain is not a ligand binding domain.

"Minimal promoter" refers to the minimal expression control sequence that is
necessary for initiating transcription of a selected DNA sequence to which it is operably
linked. For example, the term "minimal promoter" may be used to refer to a DNA sequence
which is derived from a regulatory region upstream of a gene, contains a TATA box flanked
upstream by about 20-30 base pairs and on its 3' end by about 100-300 bp, and which has little
or no basal promoter activity, i.e., less than about 1% of the promoter activity observed with
the full length regulatory region as determined by any measure of transcriptional activity.

The terms "promoter" and "transcription control sequence" further encompass
"tissue specific" promoters and expression control sequences, i.e., promoters and
expression control sequences which effect expression of the selected DNA sequence
preferentially in specific cells (e.g., cells of a specific tissue). Gene expression occurs
preferentially in a specific cell if expression in this cell type is significantly higher than
expression in other cell types. The terms "promoter" and "expression control sequence"
also encompass so-called "leaky" promoters and "expression control sequences", which
regulate expression of a selected DNA primarily in one tissue, but cause expression in other
tissues as well. These terms also encompass non-tissue specific promoters and
expression control sequences which are active in most cell types. Furthermore, a promoter or
expression control sequence can be constitutive, i.e., one which is active basally, or inducible,
i.e., a promoter or expression control sequence which is active primarily in response to a
stimulus. A stimulus can be, e.g., a molecule, such as a hormone, a cytokine, a heavy metal,
phorbol esters, cyclic AMP (cAMP), or retinoic acid.

"Nucleic acid" refers to polynucleotides such as deoxyribonucleic acid (DNA), and, 

where appropriate, ribonucleic acid (RNA). The term should also be understood to include,
as equivalents, derivatives, variants and analogs of either RNA or DNA made from nucleotide 
analogs, and, as applicable to the embodiment being described, single (sense or antisense) 
and double-stranded polynucleotides. 

"Oligomerization" and "multimerization", used interchangeably herein, refer to the 

association of two or more proteins which can be constitutive or inducible. Inducible
oligomerization is mediated, in the practice of this invention, by the binding of each such
protein to a common ligand. "Dimerization" refers to the association of two proteins. The
formation of a tripartite (or greater) complex comprising proteins containing one or more FKBP
domains together with one or more molecules of an FKBP ligand which is at least divalent
(e.g. FK1012 or AP1510) is an example of such association or clustering. In cases where at
least one of the proteins contains more than one ligand binding domain, e.g., where at least
one of the proteins contains three FKBP domains, the presence of a divalent ligand leads to
the clustering of more than two protein molecules. Embodiments in which the ligand is more
than divalent (e.g. trivalent) in its ability to bind to proteins bearing ligand binding domains
also can result in clustering of more than two protein molecules. The formation of a tripartite
complex comprising a protein containing at least one FRB domain, a protein containing at
least one FKBP domain and a molecule of rapamycin is another example of such protein
clustering. In certain embodiments of this invention, fusion proteins contain multiple FRB
and/or FKBP domains. Complexes of such proteins may contain more than one molecule of
rapamycin or a derivative thereof and more than one copy of one or more of the constituent
proteins. Again, such multimeric complexes are still referred to herein as tripartite complexes
to indicate the presence of the three types of constituent molecules, even if one or more are
represented by multiple copies. The formation of complexes containing at least one divalent
ligand and at least two molecules of a protein which contains at least one ligand binding
domain may be referred to as "oligomerization" or "multimerization", or simply as "dimerization",
"clustering" or "association".

"Operably linked" refers to an arrangement of elements wherein the components so 
described are configured so as to perform their usual function. Thus, a transcription control 
sequence operably linked to a coding sequence permits expression of the coding sequence.
The control sequence need not be contiguous with the coding sequence so long as it
functions to direct the expression thereof. Thus, for example, intervening untranslated yet
transcribed sequences can be present between a promoter sequence and the coding
sequence and the promoter sequence can still be considered "operably linked" to the coding
sequence.

"Protein", "polypeptide" and "peptide" are used interchangeably herein when 
referring to a gene product, e.g., as may be encoded by a coding sequence. 

A "recombinant virus" is a virus particle in which the packaged nucleic acid contains
a heterologous portion.

A "target gene" is a nucleic acid of interest, the expression of which is modulated 
according to the methods of the invention. The target gene can be endogenous or 
exogenous and can integrate into a cell's genome, or remain episomal. The target gene can
encode a protein or be a non-coding nucleic acid, e.g., a nucleic acid which is transcribed into
an antisense RNA or a ribozyme.

"Transcription factor" refers to any protein or modified form thereof that is involved 
in the initiation of transcription but which is not itself a part of the polymerase. Transcription 
factors are proteins or modified forms thereof, which interact preferentially with specific nucleic
acid sequences, i.e., regulatory elements. Some transcription factors are active when they
are in the form of a monomer. Alternatively, other transcription factors are active in the form of
oligomers consisting of two or more identical proteins or different proteins (e.g., a heterodimer).
The factors have different actions during transcription initiation: they may interact with other
factors, with the RNA polymerase, with the entire complex, with activators, or with DNA.
Transcription factors usually contain one or more transcription regulatory domains.

"Transcription activation motifs" as that phrase is used herein means a peptide
motif of usually at least 6 amino acid residues which is either a transcription potentiating motif
(i.e., it need not have a naturally occurring peptide sequence) or is associated with a
transcription activation domain, including, as non-limiting examples, the well-known "acidic",
"glutamine-rich" and "proline-rich" motifs such as the K13 motif from p65, the OCT2 Q domain
and the OCT2 P domain, respectively.

"Transfection" means the introduction of a naked nucleic acid molecule into a 
recipient cell. "Infection" refers to the process wherein a virus enters the cell in a manner 
whereby the genetic material of the virus can be expressed in the cell. A "productive
infection" refers to the process wherein a virus enters the cell, is replicated, and then released
from the cell (sometimes referred to as a "lytic" infection). "Transduction" encompasses the
introduction of nucleic acid into cells by any means.

"Transgene" refers to a nucleic acid sequence which has been introduced into a cell. 
Daughter cells deriving from a cell in which a transgene has been introduced are also said to
contain the transgene (unless it has been deleted). A transgene can encode, e.g., a
polypeptide, partly or entirely heterologous to the animal or cell into which it is introduced, or
comprises or is derived from an endogenous gene of the animal or cell into which it is
introduced, but which is designed to be inserted, or is inserted, into the recipient's genome in
such a way as to alter that genome (e.g., it is inserted at a location which differs from that of
the natural gene). Alternatively, a transgene can also be present in an episome. A
transgene can include one or more expression control sequences and any other nucleic acid
(e.g., an intron) that may be necessary or desirable for optimal expression of a selected coding
sequence.

"Transient transfection" refers to cases where exogenous DNA does not integrate 
into the genome of a transfected cell, e.g., where episomal DNA is transcribed into mRNA and
translated into protein. A cell has been "stably transfected" with a nucleic acid construct
when the nucleic acid construct has been integrated into the genome of that cell.

"Wild-type" means naturally occurring in a normal cell. 

Components of the system and additional details 

The system, as employed in cells, comprises: (1) a recombinant nucleic acid
construct encoding and capable of directing the expression of a chimeric transcription factor
protein as described above; and (2) a target gene construct containing a target gene and a
transcription control sequence permitting transcription of the target gene under the direction of
the chimeric transcription factor and in a ligand-dependent manner. The transcription control
sequence comprises a DNA promoter sequence and one or more copies of a DNA recognition
sequence to which the transcription factor is capable of binding ("recognizing"). In
embodiments in which the chimeric transcription factor does not include a DNA binding
domain, a second recombinant nucleic acid is included which encodes an accessory chimeric
protein comprising at least one DNA binding domain and a ligand binding domain permitting
ligand-dependent crosslinking with the chimeric transcription factor protein to form a
two-hybrid-type chimeric transcription factor complex, capable of recognizing the target gene's
transcription control sequence and stimulating transcription of the target gene in a
ligand-dependent manner.

1. Transcription Activation Domains 

Chimeric transcription factors of this invention contain one or more copies of one or
more p65 domains, one or more ligand binding domains, and optionally one or more copies of
one or more transcription activation or potentiating domains. Such additional activation
domains may be selected from peptide sequences of naturally occurring transcription factors
such as the widely used transcription activation domain of Herpes Simplex Virus VP16 or
human heat shock factor, may be derived from such sequences or may comprise a composite
transcription activation region. A composite transcription activation region consists of a
continuous polypeptide region containing two or more reiterated or mutually heterologous
component polypeptide portions. The component polypeptide portions comprise
polypeptide sequences derived from at least two different proteins, polypeptide sequences
from at least two non-adjacent portions of the same protein, polypeptide sequences which
are not found so linked in nature (including reiterated copies of a polypeptide sequence) or
non-naturally occurring peptide sequence. Preferably the activation domain or component
peptide sequences thereof are selected or derived from peptide sequences endogenous to
the cells or organism to be engineered.

NF-κB p65, preferably human, turned out to be an important source of transcription
activation domains and motifs. p65(450-550) is a known transcription activation domain
although methods and materials for using it as described herein have not been previously
reported. We have found that extending the p65 peptide sequence to include sequence
spanning p65 residues 361-450 leads to an unexpected increase in transcription activation.
Moreover, a peptide sequence comprising all or a portion of p65(361-550), or peptide
sequence derived therefrom, in combination with heterologous activation motifs, can yield
surprising additional increases in the level of transcription activation. Incorporation of
additional residues from p65, including those from 321-360 or 281-360, allows an even
greater increase in transcription levels in certain systems. p65-based activation domains
function across a broad range of promoters and have yielded increases in transcription levels
six-fold, eight-fold and even 14-15-fold higher than obtained with tandem copies of VP16,
which itself is widely recognized as a very potent activation domain.

While the resultant increases in activation potency are dramatic, p65-based
transcription factors possess additional and unexpected characteristics. For instance, our
p65-based activators appear to be less toxic to the engineered cells in comparison to
VP16-based activators. This is clearly of profound practical significance in many applications. It is
expected that recombinant DNA molecules encoding chimeric proteins which contain a
peptide sequence comprising all or a portion of p65(361-550), p65(321-550) or p65(281-550),
especially containing one or more portions of the sequence spanning residues 361 to
450, or peptide sequence derived therefrom, will provide significant advantages for
heterologous gene expression in its various contexts.

One class of p65-based chimeric transcription factors contains more than one copy of
a p65-derived domain. Such proteins will typically contain two to six copies of a peptide
sequence comprising all or a portion of p65(361-550), or peptide sequence derived
therefrom. Such transcription factors may contain one or more ligand-binding domains to
provide for regulation as described elsewhere herein, and in some cases additional other
domains.

Chimeric transcription factors of this invention may contain, in addition to one or more
copies of a p65 activation domain such as described above, one or more copies of one or
more heterologous peptide sequences which potentiate the transcription activation potency
of the transcription factor, as measured by any means. Inclusion of such motifs, including the
so-called "glutamine-rich", "proline-rich" and "acidic" transcription activation motifs, in
combination with a primary activation domain can result in extremely high levels of
transcription.

A wide variety of transcription activation domains and motifs can be used in the 
practice of the present invention in conjunction with p65-based domains. Polypeptides 
which can function to activate transcription in eukaryotic cells are well known in the art. In
particular, transcription activation domains have been described for many transcription factors 
and have been shown to retain their activation function when the transcription activation 
domain, or a suitable fragment thereof, is present within a fusion protein. 

Activation domains can comprise naturally occurring or non-naturally occurring peptide 
sequences, so long as, either alone or in combination with other activation domains, they are
capable of activating transcription. Any particular activation domain is preferably at least 6
amino acids in length. Naturally occurring activation domain subunits or motifs include
portions of transcription factors, such as a thirty amino acid fragment of the C-terminus of
VP16 (amino acids 461-490), referred to herein as "Vc". Other activation domain subunits are
derivatives of naturally occurring peptides. For example, the replacement of one amino acid
of a naturally occurring activation unit by another may further increase activation. An
example of such an activation unit is a derivative of an eight amino acid peptide of VP16, the
derivative having the amino acid sequence DFDLDMLG.

Yet other activation units are entirely synthetic. It is known, for example, that certain 
random alignments of acidic amino acids are capable of activating transcription.

It is well known in the art that certain transcription factors are active only in specific 
cell types. By using tissue specific activation domains, it is possible to design a transcription 
factor having a certain tissue specificity. 

One source of polypeptide motifs for use in conjunction with p65-based activation
domains is the herpes simplex virus virion protein 16 (referred to herein as VP16, the amino
acid sequence of which is disclosed in Triezenberg, S.J. et al. (1988) Genes Dev. 2:718-729).
In one embodiment, an activation domain corresponding to about 127 of the C-terminal
amino acids of VP16 is used. For example, a polypeptide having amino acid residues 208-335
can be used as an auxiliary activation domain. In another embodiment, at least one
copy of about 11 amino acids from the C-terminal region of VP16 which retain transcription
activation ability is used as an additional activation domain. Preferably, an oligomer of this
region (i.e., about 22 amino acids) is used. Suitable C-terminal peptide portions of VP16 are
described in Seipel, K. et al. (EMBO J. (1992) 13:4961-4968). VP16-derived transcription
activation domains have been used successfully in many of the different regulated
expression systems referred to herein.

Another example of an acidic activation domain is provided in residues 753-881 of
GAL4. A preferred activation domain to be used in conjunction with a p65 domain is a domain
from a human heat shock factor transcriptional activator (HSF domain), in particular, residues
406-530. Heat shock factor activation domains are more fully described in USSN
09/262,600, filed 3/4/99, the full contents of which are incorporated herein by reference.

Other illustrative activation domains and motifs of human origin include the activation
domain of human CTF, the 18 amino acid (NFLQLPQQTQGALLTSQP) glutamine rich region
of Oct-2, the N-terminal 72 amino acids of p53, the SYGQQS repeat in Ewing sarcoma gene
and an 11 amino acid (535-545) acidic rich region of Rel A protein. Various additional
activation domains, motifs and chimeric transcription factors are provided in the examples
which follow. See also USSN 08/920,610 (ARIAD 363-A), the contents of which are
incorporated herein by reference, especially for additional information concerning sources of
activation domains and motifs that may be used in combination with p65 domains in the
chimeric transcription domains of this invention.

2. Ligand binding domains

The chimeric transcription factors contain at least one p65 domain and one ligand 
binding domain, but function, in the various embodiments, through different molecular 
mechanisms. 

A. Dimerization-based systems

In certain embodiments, the ligand binding domain permits ligand-mediated cross-linking
of the chimeric transcription factor with a second fusion protein (which contains at least
one ligand binding domain and DNA binding domain). In these cases, the ligand is at least
divalent and functions as a dimerizing agent by binding to the two fusion proteins and
forming a cross-linked heterodimeric complex which activates target gene expression.
See e.g. WO 94/18317, WO 96/20951, WO 96/06097, WO 97/31898, WO 96/41865, and
PCT/US98/17723, the contents of which are incorporated herein by reference.

In other embodiments, the ligand binding event is thought to result in an allosteric
change in the chimeric transcription factor leading to binding of the fusion protein to a target
DNA sequence [see e.g. US 5,654,168 and 5,650,298 (tet systems), and WO 93/23431 and
WO 98/18925 (RU486-based systems)] or to another protein [see e.g. WO 96/37609 and
WO 97/38117 (ecdysone/RXR-based systems)], in either case, modulating target gene
expression.

In the cross-linking-based dimerization systems the fusion proteins can contain one or
more ligand binding domains (in some cases containing two, three or four such domains) and
can further contain one or more additional domains, heterologous thereto, including e.g. a DNA
binding domain, transcription activation domain, etc.

In general, any ligand/ligand binding domain pair may be used in such systems. For
example, ligand binding domains may be derived from an immunophilin such as an FKBP,
cyclophilin, FRB domain, hormone receptor protein, antibody, etc., so long as a ligand is
known or can be identified for the ligand binding domain.

For the most part, the receptor domains will be at least about 50 amino acids, and
fewer than about 350 amino acids, usually fewer than 200 amino acids, either as the natural
domain or truncated active portion thereof. Preferably the binding domain will be small (<25
kDa, to allow efficient transfection in viral vectors), monomeric, nonimmunogenic, and should
have synthetically accessible, cell permeant, nontoxic ligands as described above.

Preferably the ligand binding domain is for (i.e., binds to) a ligand which is not itself a
gene product (i.e., is not a protein), has a molecular weight of less than about 5 kD and
preferably less than about 3 kD, and is cell permeant. In many cases it will be preferred that
the ligand does not have an intrinsic pharmacologic activity or toxicity which interferes with its
use as a transcription regulator.

The DNA sequence encoding the ligand binding domain can be subjected to
mutagenesis for a variety of reasons. The mutagenized ligand binding domain can provide
for higher binding affinity, allow for discrimination by a ligand between the mutant and
naturally occurring forms of the ligand binding domain, provide opportunities to design
ligand-ligand binding domain pairs, or the like. The change in the ligand binding domain can involve
directed changes in amino acids known to be involved in ligand binding or with
ligand-dependent conformational changes. Alternatively, one may employ random mutagenesis
using combinatorial techniques. In either event, the mutant ligand binding domain can be
expressed in an appropriate prokaryotic or eukaryotic host and then screened for desired
ligand binding or conformational properties. Examples involving FKBP, cyclophilin and FRB
domains are disclosed in detail in WO 94/18317, WO 96/06097, WO 97/31898 and WO
96/41865. One illustration is the modification of FKBP12's Phe36 to Ala and/or Asp37 to
Gly or Ala to accommodate a substituent at positions 9 or 10 of FK506 or FK520. In particular,
mutant FKBP12 moieties which contain Val, Ala, Gly, Met or other small amino acids in place
of one or more of Tyr26, Phe36, Asp37, Tyr82 and Phe99 are of particular interest as
receptor domains for FK506-type and FK520-type ligands containing modifications at C9
and/or C10. Illustrative mutations of current interest in FKBP domains also include the
following:

F36A    Y26V    F46A    W59A
F36V    Y26S    F48H    H87W
F36M    D37A    F48L    H87R
F36S    I90A    F48A    F36V/F99A
F99A    I91A    E54A    F36V/F99G
F99G    F46H    E54K    F36M/F99A
Y26A    F46L    V55A    F36M/F99G

Table 1. Entries identify the native amino acid by single letter code and sequence position,
followed by the replacement amino acid in the mutant. Thus, F36V designates a human
FKBP12 sequence in which phenylalanine at position 36 is replaced by valine. F36V/F99A
indicates a double mutation in which the phenylalanines at positions 36 and 99 are replaced by
valine and alanine, respectively.
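
The Table 1 notation can be read mechanically. The following is a minimal Python sketch, offered only as an illustration of the notation: the names `Substitution`, `parse_mutant` and `apply_mutant` are hypothetical and not part of this disclosure, and the four-residue sequence in the usage lines is a toy string, not the FKBP12 sequence.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    native: str       # single-letter code of the native residue
    position: int     # 1-based sequence position
    replacement: str  # single-letter code of the replacement residue


# One substitution: native letter, position digits, replacement letter.
_SUBSTITUTION = re.compile(r"([A-Z])([0-9]+)([A-Z])")


def parse_mutant(label: str) -> List[Substitution]:
    """Split a label such as 'F36V/F99A' into its individual substitutions."""
    substitutions = []
    for part in label.split("/"):
        match = _SUBSTITUTION.fullmatch(part.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution: {part!r}")
        native, position, replacement = match.groups()
        substitutions.append(Substitution(native, int(position), replacement))
    return substitutions


def apply_mutant(sequence: str, label: str) -> str:
    """Apply each substitution, checking that the native residue is as stated."""
    residues = list(sequence)
    for sub in parse_mutant(label):
        if not 1 <= sub.position <= len(residues):
            raise ValueError(f"position {sub.position} is outside the sequence")
        if residues[sub.position - 1] != sub.native:
            raise ValueError(f"expected {sub.native} at position {sub.position}")
        residues[sub.position - 1] = sub.replacement
    return "".join(residues)


print(parse_mutant("F36V/F99A"))
print(apply_mutant("MFGF", "F2V/F4A"))  # toy sequence, for illustration only
```

Checking the stated native residue before substituting catches labels that were numbered against a different reference sequence.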

Illustrative examples of rapamycin-binding domains are those which include an
approximately 89-amino acid rapamycin-binding domain from FRAP, e.g., containing residues
2025-2113 of human FRAP. Another preferred portion of FRAP is a 93 amino acid fragment
consisting of amino acids 2021-2113. Similar considerations apply to the generation of
mutant FRAP-derived domains which bind preferentially to rapamycin analogs (rapalogs)
containing modifications (i.e., are 'bumped') relative to rapamycin in the FRAP-binding effector
domain. For example, one may obtain preferential binding using rapalogs bearing
substituents other than -OMe at the C7 position with FRBs based on the human FRAP FRB
peptide sequence but bearing amino acid substitutions for one or more of the residues
Tyr2038, Phe2039, Thr2098, Gln2099, Trp2101 and Asp2102. Exemplary mutations include
Y2038H, Y2038L, Y2038V, Y2038A, F2039H, F2039L, F2039A, F2039V, D2102A, T2098A,
T2098N, T2098L, and T2098S. Rapalogs bearing substituents other than -OH at C28
and/or substituents other than =O at C30 may be used to obtain preferential binding to
FRAP proteins bearing an amino acid substitution for Glu2032. Exemplary mutations include
E2032A and E2032S. Proteins comprising an FRB containing one or more amino acid
replacements at the foregoing positions, libraries of proteins or peptides randomized at those
positions (i.e., containing various substituted amino acids at those residues), libraries
randomizing the entire protein domain, or combinations of these sets of mutants are made
using the procedures described above to identify mutant FRAPs that bind preferentially to
bumped rapalogs. See, for example, USSN 09/012,097, the contents of which are
incorporated herein by reference.

Other macrolide binding domains useful in the present invention, including mutants
thereof, are described in the art. See, for example, WO 96/41865, WO 96/13613,
WO 96/06111, WO 96/06110, WO 96/06097, WO 96/12796, WO 95/05389, WO 95/02684, and
WO 94/18317, each of which is expressly incorporated by reference herein.
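
The fragment lengths quoted above for the FRAP-derived rapamycin-binding domains follow from inclusive residue numbering. A minimal Python sketch of that arithmetic is given below; the helper name `fragment_length` is hypothetical and is used here only for illustration.

```python
def fragment_length(first_residue: int, last_residue: int) -> int:
    """Number of residues in a fragment, counting both end positions."""
    if last_residue < first_residue:
        raise ValueError("last residue must not precede first residue")
    return last_residue - first_residue + 1


# Residue ranges as quoted in the text for human FRAP.
print(fragment_length(2025, 2113))  # 89
print(fragment_length(2021, 2113))  # 93
```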

The ability to employ in vitro mutagenesis or combinatorial modifications of sequences
encoding proteins allows for the production of libraries of proteins which can be screened for
binding affinity for different ligands. For example, one can totally randomize a sequence of 1
to 5, 10 or more codons, at one or more sites in a DNA sequence encoding a binding protein,
make an expression construct and introduce the expression construct into a unicellular
microorganism, and develop a library. One can then screen the library for binding affinity to
one or desirably a plurality of ligands. The best affinity sequences which are compatible
with the cells into which they would be introduced can then be used as the ligand binding
domain. The ligand would be screened with the host cells to be used to determine the level
of binding of the ligand to endogenous proteins. A binding profile could be defined weighting
the ratio of binding affinity to the mutagenized binding domain with the binding affinity to
endogenous proteins. Those ligands which have the best binding profile could then be used
as the ligand. Phage display techniques, as a non-limiting example, can be used in carrying
out the foregoing.
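
One way such a binding profile might be computed is sketched below in Python. This is an illustration under stated assumptions, not a prescribed method: affinity is represented by a dissociation constant (a lower Kd means tighter binding), the profile is taken as the simple ratio Kd(endogenous)/Kd(mutagenized domain), and the function names, ligand names and Kd values are all hypothetical.

```python
from typing import Dict, List, Tuple


def selectivity(kd_mutant_nm: float, kd_endogenous_nm: float) -> float:
    """Fold preference for the mutagenized domain over endogenous proteins.

    Assumes affinity is expressed as a dissociation constant in nM, so a
    larger Kd(endogenous)/Kd(mutant) ratio means a more selective ligand.
    """
    if kd_mutant_nm <= 0 or kd_endogenous_nm <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_endogenous_nm / kd_mutant_nm


def rank_ligands(
    measurements: Dict[str, Tuple[float, float]]
) -> List[Tuple[str, float]]:
    """Sort ligands from most to least selective.

    `measurements` maps a ligand name to (Kd for mutant domain, Kd for
    endogenous proteins), both in nM.
    """
    scored = [
        (name, selectivity(kd_mutant, kd_endogenous))
        for name, (kd_mutant, kd_endogenous) in measurements.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical values, for illustration only.
example = {
    "ligand_A": (1.0, 1000.0),
    "ligand_B": (0.5, 5.0),
    "ligand_C": (10.0, 50000.0),
}
for name, score in rank_ligands(example):
    print(name, score)
```

In this toy data set the tightest binder (ligand_B) ranks last because it binds endogenous proteins almost as well, which is the situation the binding profile is meant to expose.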

In other embodiments, antibody subunits, e.g. heavy or light chain, particularly
fragments, more particularly all or part of the variable region, or fusions of heavy and light
chain to create single chain antibodies, can be used as the ligand binding domain. Antibodies
can be prepared against haptenic molecules which are physiologically acceptable and the
individual antibody subunits screened for binding affinity. The cDNA encoding the subunits
can be isolated and modified by deletion of the constant region, portions of the variable
region, mutagenesis of the variable region, or the like, to obtain a binding protein domain that
has the appropriate affinity for the ligand. In this way, almost any physiologically
acceptable haptenic compound can be employed as the ligand or to provide an epitope for
the ligand. Instead of antibody units, natural receptors can be employed, where the binding
domain is known and there is a useful ligand for binding. In yet another embodiment of the
invention, the DNA binding unit is linked to more than one ligand binding domain. For
example, a DNA binding domain can be linked to at least 2, 3, 4, or 5 ligand binding domains.
A DNA binding domain can also be linked to at least 5 ligand binding domains or any number
of ligand binding domains. In such embodiments, the ligand binding domains can be, by
illustration, linked to each other in a linear array, by linking the NH2-terminus of one ligand
binding domain to the COOH-terminus of another ligand binding domain. Thus, more than
one molecule of a chimeric transcription factor can be cross-linked to a single DNA binding
domain in the presence of a divalent ligand.

B. Allostery-based systems

As mentioned previously, ligand-dependent transcription regulation switches based
on allosteric changes in a chimeric transcription factor are also useful in practicing the subject
invention. One such switch employs a deletion mutant of the human progesterone receptor
which no longer binds progesterone or any known endogenous steroid but can be activated
by the orally active progesterone antagonist RU486, described, e.g., in Wang et al. (1994)
Proc. Natl. Acad. Sci. U.S.A. 91:8180. The transcription factor in this system generally
consists of a ligand binding domain for binding RU486, a DNA binding domain such as GAL4
and an activation domain, typically VP16. Activation was demonstrated, e.g., in cells
transplanted into mice using doses of RU486 (5-50 µg/kg) considerably below the usual
dose for inducing abortion in humans (10 mg/kg). However, according to the art describing
this system, the induction ratio in culture and in animals was rather low.
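
To put the dose comparison above on a common scale, the short Python sketch below converts the figures to the same unit. It assumes that the activating doses read as 5-50 micrograms per kilogram and the reference dose as 10 milligrams per kilogram; the helper name `fold_below_reference` is hypothetical.

```python
UG_PER_MG = 1000.0  # micrograms per milligram


def fold_below_reference(dose_ug_per_kg: float, reference_mg_per_kg: float) -> float:
    """How many times smaller a dose (in ug/kg) is than a reference (in mg/kg)."""
    if dose_ug_per_kg <= 0:
        raise ValueError("dose must be positive")
    return reference_mg_per_kg * UG_PER_MG / dose_ug_per_kg


# Doses as read from the text (assumed units: ug/kg versus mg/kg).
print(fold_below_reference(50.0, 10.0))  # 200.0
print(fold_below_reference(5.0, 10.0))   # 2000.0
```

Under these assumptions the activating doses are roughly 200-fold to 2000-fold below the reference dose.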

Another such system is referred to as the ecdysone inducible system. Early work
demonstrated that fusing the Drosophila steroid ecdysone (Ec) receptor (EcR) Ec-binding
domain to heterologous DNA binding and activation domains, such as E. coli lexA and
herpesvirus VP16, permits ecdysone-dependent activation of target genes downstream of
appropriate binding sites (Christopherson et al. (1992) Proc. Natl. Acad. Sci. U.S.A.
89:6314). An improved ecdysone regulation system has been developed, using the DNA
binding domain of the EcR itself. In this system, the regulating transcription factor is provided
as two proteins: (1) a truncated, mutant EcR fused to herpes VP16 and (2) the mammalian
homolog (RXR) of Ultraspiracle protein (USP), which heterodimerizes with the EcR (No et al.
(1996) Proc. Natl. Acad. Sci. U.S.A. 93:3346). In this system, because the DNA binding
domain was also recognized by a human receptor (the human farnesoid X receptor), it was
altered to a site recognized only by the mutant EcR. Thus, the invention provides an
ecdysone inducible system, in which a truncated mutant EcR is fused to at least one subunit
of a transcription activator of the invention. The transcription factor further comprises USP,
thereby providing high level induction of transcription of a target gene having the EcR target
sequence, dependent on the presence of ecdysone.

In another embodiment, the inducible system comprises the E. coli tet repressor
(TetR), which binds to tet operator (tetO) sequences upstream of target genes. In the
presence of tetracycline, or an analog, which binds to TetR, DNA binding is abolished and thus
transactivation is abolished. This system, in which the TetR had previously been linked to
transcription activation domains, e.g., from VP16, is generally referred to as an allosteric
"off-switch" described by Gossen and Bujard (Proc. Natl. Acad. Sci. U.S.A. (1992) 89:5547) and
in U.S. Patents 5,464,758; 5,650,298; and 5,589,362 by Bujard et al. Furthermore,
depending on the concentration of the antibiotic in the culture medium (0-1 µg/ml), target
gene expression can be regulated over a range of up to several orders of magnitude.
Thus, the system not only allows differential control of the activity of an individual gene in
eukaryotic cells but also is suitable for creation of "on/off" situations for such genes in a
reversible way. This system provides target gene expression in the absence of tetracycline
or an analog. Thus, the invention described herein provides a method for obtaining even
stronger transcription induction of a target gene, which is regulatable by the tetracycline
system or other inducible DNA binding domain.

In another embodiment, a "reverse" Tet system is used, again based on a DNA 
binding domain that is a mutant of the E. coli TetR, but which binds to tetO in the presence
of tetracycline. Additional information on mutated TetR-based systems is provided above and in
patent documents cited previously. Thus, the invention described herein provides a method 
for obtaining even stronger transcription induction of a target gene in the presence of 
tetracycline or an analog thereof from a very low background in the absence of tetracycline. 



3. Ligands of the invention

In various embodiments where a ligand binding domain for the ligand is
endogenous to the cells to be engineered, it is desirable to use a ligand which preferentially
binds to a modified ligand binding domain relative to a naturally occurring peptide sequence,
e.g., the sequence from which the modified domain was derived. This approach can avoid
untoward intrinsic activities of the ligand. Significant guidance and illustrative examples toward that end are
provided in the various references cited herein. 

A. Cross-linking/dimerization systems 

Any ligand for which a binding protein or ligand binding domain is known or can be 
identified may be used in combination with such ligand binding domain in carrying out this
invention. 

Extensive guidance and examples are provided in WO 94/18317 for ligands and
other components useful for cross-linked oligomerization-based systems. Systems based on
ligands for an immunophilin such as FKBP, a cyclophilin, and/or FRB domain are of special
interest. Illustrative examples of ligand binding domain/ligand pairs that may be used for
cross-linking include, but are not limited to: FKBP/FK1012, FKBP/synthetic divalent FKBP
ligands (see WO 96/06097 and WO 97/31898), FRB/rapamycin or analogs thereof/FKBP
(see e.g., WO 93/33052, WO 96/41865 and Rivera et al, "A humanized system for
pharmacologic control of gene expression", Nature Medicine 2(9):1028-1032 (1996)),
cyclophilin/cyclosporin (see e.g. WO 94/18317), FKBP/FKCsA/cyclophilin (see e.g.
Belshaw et al, 1996, PNAS 93:4604-4607), DHFR/methotrexate (see e.g. Licitra et al, 1996,
Proc. Natl. Acad. Sci. USA 93:12817-12821), and DNA gyrase/coumermycin (see e.g. Farrar
et al, 1996, Nature 383:178-181). Numerous variations and modifications to ligands and
ligand binding domains, as well as methodologies for designing, selecting and/or
characterizing them, which may be adapted to the present invention are disclosed in the cited
references.

B. Allostery-based systems

For additional guidance on ligands for other systems which may be adapted to this invention,
see e.g. Gossen and Bujard, Proc. Natl. Acad. Sci. U.S.A. 1992 89:5547, and US Patent
Nos. 5654168, 5650298, 5589362 and 5464758 (TetR/tetracycline), Wang et al, 1994, Proc.
Natl. Acad. Sci. USA 91:8180-8184 (progesterone receptor/RU486), and No et al, 1996,
Proc. Natl. Acad. Sci. USA 93:3346-3351 (ecdysone receptor/ecdysone).

4. DNA-binding domains

Regulated expression systems relevant to this invention involve the use of a protein 
containing a DNA binding domain to selectively target a desired gene for expression. In 
systems based on ligand-mediated cross-linking, the DNA binding domain can be provided in 
a fusion protein with one or more ligand binding domains. In certain embodiments based on 
allosteric mechanisms, e.g. the tetR and progesterone-R-based systems, the transcription 
activation domain is provided as part of a fusion protein which further comprises a DNA- 
binding domain. 

Various DNA binding domains may be incorporated into the design of the chimeric 
transcription factor (or companion fusion protein) so long as a corresponding DNA 
"recognition" sequence is known or can be identified to which the domain is capable of 
binding. One or more copies of the recognition sequence are incorporated into the 
transcription control sequence of the target gene construct. Peptide sequence of human origin 
is often preferred, where available, for uses in human gene therapy. Composite DNA binding
domains provide one means for achieving novel sequence specificity for the protein-DNA 
binding interaction. An illustrative composite DNA binding domain containing component 
peptide sequences of human origin is ZFHD-1 which is described in detail below. Individual 
DNA-binding domains may be further modified by mutagenesis to decrease, increase, or 
change the recognition specificity of DNA binding. These modifications could be achieved by 
rational design of substitutions in positions known to contribute to DNA recognition (often 
based on homology to related proteins for which explicit structural data are available). For 
example, in the case of a homeodomain, substitutions can be made in amino acids in the N- 
terminal arm, first loop, second helix, and third helix known to contact DNA. In zinc fingers, 
substitutions can be made at selected positions in the DNA recognition helix. Alternatively,
random methods, such as selection from a phage display library, could be used to identify
altered domains with increased affinity or altered specificity.

One type of DNA binding domain of interest is a DNA binding domain which binds to
a characteristic DNA sequence in a ligand-dependent manner. This sort of DNA binding
domain is exemplified by the Tet repressor (tetR) and mutated versions thereof which are
discussed in detail elsewhere herein and in various cited references.

See the various patent and scientific documents cited herein, and in particular, WO
96/20951, WO 94/18317 and USSN 60/084819 for additional guidance on DNA binding
domains and their use in such systems, as well as on other aspects of this invention.

5. Regulatory domains 

Domains which regulate the transcriptional activity of p65-containing chimeric
transcription factors may be included in the chimeric transcription factors of this invention.
Such domains may be, for example, multimerization domains, as described in Natesan USSN
09/140,149. One example of such a multimerization domain is the trimerization domain which
spans amino acids 126-217 of human HSF1. This domain is required for DNA binding by the
wild-type heat shock transcription factor (Morimoto 1998, Genes and Development 12:3788-3796)
and may be used e.g., as previously disclosed, to recruit additional activation domains
to the promoter. Optionally, the regulatory domain of HSF1, comprising amino acids 201-371,
may be used in the chimeric transcription factors of this invention. In some embodiments, a
smaller fragment comprising amino acids 221-310 may be used. In yet other embodiments, a
minimal regulatory domain comprising amino acids 300-310, which contains the serine
phosphorylation sites at positions 303 and 307, may be used. If desired, one or more of the
serines at positions 303 and 307 in the context of the full length or minimal regulatory domain
may be replaced by different amino acids. As described in Green et al., supra, the regulatory
domain negatively regulates the transcriptional activity of the HSF domain in the wild type
transcription factor as well as in fusion proteins containing the GAL4 DNA binding domain. If
lower levels of transcriptional activation are required or desired, this domain may be fused to
the p65 domain alone or in combination with other potentiating or regulatory domains.

Another regulatory domain which may be used in combination with a p65 domain in
the practice of this invention comprises the peptide sequence spanning amino acids 280-360
of human NF-kB p65. In constructs expressing p65(281-550), the regulatory region
stabilizes the activation domain (361-550) and allows higher levels of transcription factor to
be expressed in the cell.

6. Additional domains 

Additional domains may be included in chimeric proteins of this invention. For
example, the chimeric proteins may contain a nuclear localization sequence which provides
for translocation of the protein to the nucleus. Typically a nuclear localization sequence has a
plurality of basic amino acids, referred to as a bipartite basic repeat (reviewed in
Garcia-Bustos et al., Biochimica et Biophysica Acta (1991) 1071, 83-101). This sequence can
appear in any portion of the molecule internal or proximal to the N- or C-terminus and results
in the chimeric protein being localized inside the nucleus.

The chimeric proteins may include domains that facilitate their purification, e.g.
"histidine tags" or a glutathione-S-transferase domain. They may include "epitope tags"
encoding peptides recognized by known monoclonal antibodies for the detection of proteins
within cells or the capture of proteins by antibodies in vitro.

Transcription factors can be tested for activity in vivo using a simple assay (F.M.
Ausubel et al., Eds., Current Protocols in Molecular Biology (John Wiley & Sons, New
York, 1994); de Wet et al., Mol. Cell Biol. 7:725 (1987)). The in vivo assay requires a
plasmid containing and capable of directing the expression of a recombinant DNA sequence
encoding the transcription factor. The assay also requires a plasmid containing a reporter
gene, e.g., the luciferase gene, the chloramphenicol acetyl transferase (CAT) gene, secreted
alkaline phosphatase or the human growth hormone (hGH) gene, linked to a binding site for
the transcription factor. The two plasmids are introduced into host cells which normally do not
produce interfering levels of the reporter gene product. A second group of cells, which also
lack both the gene encoding the transcription factor and the reporter gene, serves as the
control group and receives a plasmid containing the gene encoding the transcription factor
and a plasmid containing the test gene without the binding site for the transcription factor.

The production of mRNA or protein encoded by the reporter gene is measured. An
increase in reporter gene expression not seen in the controls indicates that the transcription
factor is a positive regulator of transcription. If reporter gene expression is less than that of
the control, the transcription factor is a negative regulator of transcription.

Optionally, the assay may include a transfection efficiency control plasmid. This
plasmid expresses a gene product independent of the test gene, and the amount of this
gene product indicates roughly how many cells are taking up the plasmids and how
efficiently the DNA is being introduced into the cells. Additional guidance on evaluating
chimeric proteins of this invention is provided below.
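
The comparison between test and control groups described above can be sketched as a small calculation. The following Python sketch is illustrative only: the function names, the choice to normalize by dividing the reporter reading by the transfection-efficiency control reading, the noise tolerance and the example readings are all assumptions made for illustration and are not taken from the cited protocols.

```python
def normalized_signal(reporter, efficiency_control):
    """Divide the reporter reading by the transfection-efficiency control reading."""
    if efficiency_control <= 0:
        raise ValueError("efficiency control reading must be positive")
    return reporter / efficiency_control


def classify_regulator(test, control, tolerance=0.0):
    """Compare normalized reporter expression in the test group with the control group.

    Returns 'positive regulator' if expression exceeds the control,
    'negative regulator' if it is lower, and 'no effect' otherwise.
    `tolerance` is a hypothetical fractional margin for measurement noise.
    """
    if test > control * (1 + tolerance):
        return "positive regulator"
    if test < control * (1 - tolerance):
        return "negative regulator"
    return "no effect"


# Hypothetical readings in arbitrary units: (reporter, efficiency control).
test_group = normalized_signal(5000.0, 100.0)    # reporter with the binding site
control_group = normalized_signal(200.0, 80.0)   # reporter without the binding site
fold_change = test_group / control_group
print(classify_regulator(test_group, control_group, tolerance=0.1), fold_change)
```

Normalizing before comparing keeps a difference in plasmid uptake between the two groups from being mistaken for a difference in transcriptional activity.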

7. Promoters 

The activation domains described herein may be used in combination with any
promoter that will direct their expression in mammalian cells. The promoter may be a strong
promoter, such as the human CMV promoter, or a weaker promoter, such as a promoter for
an endogenous human gene. Other promoters which may be used include, but are not
limited to, the Rous Sarcoma Virus (RSV) promoter, the retroviral LTR from Moloney Murine
Leukemia Virus (MMLV), the muscle creatine kinase (MCK) enhancer, the SV40 promoter,
and the CMV enhancer from the major immediate early gene. Genbank accession numbers
for the above promoters are given in the table below.



Promoter                      Genbank Accession Number
CMV                           AF067197
RSV                           M83237
MMLV LTR                      M77239
SV40                          U47120
CMV enhancer for MIE gene     K03104
MCK enhancer                  X67536



In many cases, the selection of promoter will depend upon the configuration of the
transcription factor, i.e., which activation domain is used and whether any potentiating or
stabilizing domains are added. Thus, a relatively weaker promoter such as RSV or a
promoter for an endogenous human gene may be selected for use with a stabilizing domain,
for example, amino acids 280-320 of p65 or the regulatory domain of HSF. In this type of
configuration, the lower level of transcription which is normally driven by a weaker promoter
may be compensated for by a more highly expressed, stable transcription factor.

Exemplary configurations of chimeric transcription factors of this invention include an
RSV promoter, a p65(320-550) activation domain and a regulatory domain from HSF. Other
examples include (i) a CMV promoter and a p65(361-550) activation domain, (ii) a CMV
promoter, a p65(361-550) activation domain and a HSF(431-529) activation domain, (iii) an
MCK enhancer, a p65(280-320) regulatory domain and a p65(361-550) activation domain
and (iv) an RSV promoter, a p65(281-551) activation domain and a HSF(406-530) activation
domain. Any other combination of promoters, activation domains and potentiating domains
contemplated by the practitioner may be used in the practice of this invention.

8. Transcription factors, additional comments. 

In engineering cells for or in whole animals in accordance with this invention, it will
often be preferred, and in some cases required, that the various domains or subdomains of
the chimeric transcription factors be derived from proteins of the same species as the host
cell. Thus, for genetic engineering of human cells, it is often preferred that component peptide
sequences of human origin be used in some or all cases, rather than of bacterial, yeast or
other non-human source. Transcription factor constructs generally contain (1) a promoter
region consisting minimally of a TATA box and initiator sequence but optionally including
other transcription factor binding sites; (2) DNA sequence encoding the desired transcription
factor, including sequences that promote the initiation and termination of translation, if
appropriate; (3) an optional sequence consisting of a splice donor, splice acceptor, and
intervening intron DNA; and (4) a sequence directing cleavage and polyadenylation of the
resulting RNA transcript. The practitioner may select a conventional promoter such as the
widely used hCMV promoter region.

It will be preferred in certain embodiments, especially where DNA is introduced into an
animal for uptake by cells in situ, that the transcription factors be expressed in a cell-specific
or tissue-specific manner. Such specificity of expression may be achieved by operably
linking one or more of the DNA sequences encoding the chimeric protein(s) to a cell-type
specific transcriptional regulatory sequence (e.g. promoter/enhancer). Numerous cell-type
specific transcriptional regulatory sequences are known. Others may be obtained from genes
which are expressed in a cell-specific manner. For example, constructs for expressing the
chimeric proteins may contain regulatory sequences derived from genes known to exhibit
expression in selected tissues. See e.g. PCT/US95/10591 (attorney dkt ARIAD 349),
especially pp. 36-37.

9. Target gene constructs 

A DNA construct that enables transcription of a target gene to be regulated by a transcription
factor in accordance with this invention comprises a DNA molecule which includes a synthetic
transcription unit typically consisting of: (1) one copy or multiple copies of a DNA sequence
recognized with high affinity by the transcription factor or one or more of its component DNA
binding domains; (2) a promoter sequence consisting minimally of a TATA box and initiator
sequence but optionally including other transcription factor binding sites; (3) sequence
encoding the desired product, including sequences that promote the initiation and termination
of translation, if appropriate; (4) an optional sequence consisting of a splice donor, splice
acceptor, and intervening intron DNA; and (5) a sequence directing cleavage and
polyadenylation of the resulting RNA transcript. Typically the gene construct contains a copy
of the target gene to be expressed, operably linked to a transcription control sequence
comprising a minimal promoter and one or more copies of a DNA recognition sequence
responsive to the transcription factor.

(a) Target genes 

A wide variety of genes can be employed as the target gene, including genes that
encode a therapeutic protein, antisense sequence or ribozyme of interest. The target gene
can be any sequence of interest which provides a desired phenotype. It can encode a
surface membrane protein, a secreted protein, a cytoplasmic protein, or there can be a
plurality of target genes encoding different products. The target gene may be an antisense
sequence which can modulate a particular pathway by inhibiting a transcriptional regulation
protein or turn on a particular pathway by inhibiting the translation of an inhibitor of the
pathway. The target gene can encode a ribozyme which may modulate a particular
pathway by interfering, at the RNA level, with the expression of a relevant transcriptional
regulator or with the expression of an inhibitor of a particular pathway. The proteins which
are expressed, singly or in combination, can involve homing, cytotoxicity, proliferation,
immune response, inflammatory response, clotting or dissolving of clots, hormonal regulation,
etc. The proteins expressed may be naturally-occurring proteins, mutants of
naturally-occurring proteins, unique sequences, or combinations thereof.

Various secreted products include hormones, such as insulin, human growth hormone,
glucagon, pituitary releasing factor, ACTH, melanotropin, relaxin, etc.; growth factors, such as
EGF, IGF-1, TGF-alpha, -beta, PDGF, G-CSF, M-CSF, GM-CSF, FGF, erythropoietin,
thrombopoietin, megakaryocyte stimulating and growth factors, etc.; interleukins, such as IL-1
to -13; TNF-alpha and -beta, etc.; and enzymes and other factors, such as tissue
plasminogen activator, members of the complement cascade, perforins, superoxide
dismutase, coagulation factors, antithrombin-III, Factor VIIIc, vWF, Factor IX,
alpha-anti-trypsin, protein C, protein S, endorphins, dynorphin, bone morphogenetic protein, CFTR,
etc.

The gene can encode a naturally-occurring surface membrane protein or a protein
made so by introduction of an appropriate signal peptide and transmembrane sequence.
Various such proteins include homing receptors, e.g. L-selectin (Mel-14), blood-related
proteins, particularly having a kringle structure, e.g. Factor VIIIc, Factor VIIIvW, hematopoietic
cell markers, e.g. CD3, CD4, CD8, B cell receptor, TCR subunits alpha, beta, gamma or
delta, CD10, CD19, CD28, CD33, CD38, CD41, etc., receptors, such as the interleukin
receptors IL-2R, IL-4R, etc., channel proteins, for influx or efflux of ions, e.g. H+, K+,
Na+, Cl-, etc., and the like; CFTR, tyrosine activation motif, zap-70, etc.

Proteins may be modified for transport to a vesicle for exocytosis. By adding the 
sequence from a protein which is directed to vesicles, where the sequence is modified 
proximal to one or the other terminus, or situated in an analogous position to the protein
source, the modified protein will be directed to the Golgi apparatus for packaging in a vesicle. 
This process in conjunction with the presence of the chimeric proteins for exocytosis allows 
for rapid transfer of the proteins to the extracellular medium and a relatively high localized 
concentration. 

Also, intracellular proteins can be of interest, such as proteins in metabolic pathways,
regulatory proteins, steroid receptors, transcription factors, etc., depending upon the nature of
the host cell. Some of the proteins indicated above can also serve as intracellular proteins.

By way of further illustration, in T-cells, one may wish to introduce genes encoding
one or both chains of a T-cell receptor. For B-cells, one could provide the heavy and light
chains for an immunoglobulin for secretion. For cutaneous cells, e.g. keratinocytes,
particularly stem cell keratinocytes, one could provide for protection against infection, by
secreting alpha, beta or gamma interferon, antichemotactic factors, proteases specific for
bacterial cell wall proteins, etc.

In addition to providing for expression of a gene having therapeutic value, there will
be many situations where one may wish to direct a cell to a particular site. The site can
include anatomical sites, such as lymph nodes, mucosal tissue, skin, synovium, lung or other
internal organs or functional sites, such as clots, injured sites, sites of surgical manipulation,
inflammation, infection, etc. By providing for expression of surface membrane proteins which
will direct the host cell to the particular site by providing for binding at the host target site to a
naturally-occurring epitope, localized concentrations of a secreted product can be achieved.
Proteins of interest include homing receptors, e.g. L-selectin, GMP140, CLAM-1, etc., or
addressins, e.g. ELAM-1, PNAd, LNAd, etc., clot binding proteins, or cell surface proteins that
respond to localized gradients of chemotactic factors. There are numerous situations where
one would wish to direct cells to a particular site, where release of a therapeutic product could
be of great value.

(b) Minimal Promoters. 

Minimal promoters may be selected from a wide variety of known sequences,
including promoter regions from fos, hCMV, SV40 and IL-2, among many others. Illustrative
examples are provided which use a minimal CMV promoter or a minimal IL2 gene promoter
(-72 to +45 with respect to the start site; Siebenlist et al., MCB 6:3042-3049, 1986).

Promoter                      Genbank Accession Number
CMV                           AF067197
fos                           M16287
IL-2                          U57613
SV40                          U47120
CMV enhancer for MIE gene     K03104



(c) DNA recognition sequences. 

Recognition sequences for a wide variety of DNA-binding domains are known. DNA
recognition sequences for other DNA binding domains may be determined experimentally. In
the case of a composite DNA binding domain, DNA recognition sequences can be determined
experimentally, as described below, or the proteins can be manipulated to direct their
specificity toward a desired sequence. A desirable nucleic acid recognition sequence for a
composite DNA binding domain consists of a nucleotide sequence spanning at least ten,
preferably eleven, and more preferably twelve or more bases. The component binding
portions (putative or demonstrated) within the nucleotide sequence need not be fully
contiguous; they may be interspersed with "spacer" base pairs that need not be directly
contacted by the chimeric protein but rather impose proper spacing between the nucleic acid
subsites recognized by each module. These sequences should not impart expression to
linked genes when introduced into cells in the absence of the engineered DNA-binding
protein.
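
The bookkeeping involved in laying out a composite recognition sequence from two subsites and a spacer can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the subsite strings are placeholders chosen only for illustration, the function names are hypothetical, spacer positions are written as "N" to mean any base, and the default ten-base minimum simply mirrors the length stated above.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def candidate_sites(subsite_a, subsite_b, spacer_lengths, min_length=10):
    """Enumerate composite sites built from two subsites.

    Each candidate joins subsite A (either orientation), a spacer of
    unspecified bases written as 'N', and subsite B (either orientation).
    Candidates shorter than `min_length` bases are discarded.
    """
    orientations_a = (subsite_a, reverse_complement(subsite_a))
    orientations_b = (subsite_b, reverse_complement(subsite_b))
    sites = []
    for a, b, n in product(orientations_a, orientations_b, spacer_lengths):
        site = a + "N" * n + b
        if len(site) >= min_length:
            sites.append(site)
    return sites


# Placeholder subsites, chosen only to illustrate the bookkeeping.
sites = candidate_sites("TAATGA", "GGGCG", spacer_lengths=range(0, 3))
print(len(sites), sites[0])
```

Each candidate produced this way would still have to be tested experimentally; the enumeration only lists the spacings and orientations to try.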

To identify a nucleotide sequence that is recognized by a chimeric protein containing a
DNA-binding region, preferably recognized with high affinity (dissociation constants of 10^-11 M
or lower are especially preferred), several methods can be used. If high-affinity binding sites
for individual subdomains of a composite DNA-binding region are already known, then these
sequences can be joined with various spacing and orientation and the optimum configuration
determined experimentally (see below for methods for determining affinities). Alternatively,
high-affinity binding sites for the protein or protein complex can be selected from a large pool
of random DNA sequences by adaptation of published methods (Pollock, R. and Treisman,
R., 1990, A sensitive method for the determination of protein-DNA binding specificities. Nucl.
Acids Res. 18, 6197-6204). Bound sequences are cloned into a plasmid and their precise
sequence and affinity for the proteins are determined. From this collection of sequences,
individual sequences with desirable characteristics (i.e., maximal affinity for composite
protein, minimal affinity for individual subdomains) are selected for use. Alternatively, the
collection of sequences is used to derive a consensus sequence that carries the favored
base pairs at each position. Such a consensus sequence is synthesized and tested to
confirm that it has an appropriate level of affinity and specificity.

The target gene constructs may contain multiple copies of a DNA recognition sequence. For
instance, the constructs may contain 5, 8, 10 or 12 recognition sequences for GAL4 or for
ZFHD1.
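
Deriving a consensus sequence that carries the favored base at each position, as described above, amounts to a column-by-column majority count over the selected sequences. The Python sketch below is one simple way to do this and is not the procedure of the cited reference: it assumes the sequences have already been aligned to a common length, the tie-breaking rule is arbitrary, and the example sequences are hypothetical.

```python
from collections import Counter


def consensus(sequences):
    """Return the most frequent base at each position of equal-length sequences.

    Ties are resolved by taking the alphabetically first base, an arbitrary
    choice made only to keep the result deterministic.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be pre-aligned to the same length")
    result = []
    for column in zip(*sequences):
        counts = Counter(column)
        best = max(counts.values())
        result.append(min(base for base, n in counts.items() if n == best))
    return "".join(result)


# Hypothetical aligned binding sites recovered from a selection experiment.
selected = ["TAATGAGGGCG", "TAATGAGGGTG", "TAATCAGGGCG", "TTATGAGGGCG"]
print(consensus(selected))
```

A majority count ignores how strongly each position is favored, so the resulting sequence still needs the affinity and specificity testing described above.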

(d) Determination of binding affinity. 

A number of well-characterized assays are available for determining the binding
affinity, usually expressed as dissociation constant, for DNA-binding proteins and the
cognate DNA sequences to which they bind. These assays usually require the preparation
of purified protein and binding site (usually a synthetic oligonucleotide) of known
concentration and specific activity. Examples include electrophoretic mobility-shift assays,
DNase I protection or "footprinting", and filter-binding. These assays can also be used to get
rough estimates of association and dissociation rate constants. These values may be
determined with greater precision using a BIAcore instrument. In this assay, the synthetic
oligonucleotide is bound to the assay "chip," and purified DNA-binding protein is passed
through the flow-cell. Binding of the protein to the DNA immobilized on the chip is measured
as an increase in refractive index. Once protein is bound at equilibrium, buffer without protein
is passed over the chip, and the dissociation of the protein results in a return of the refractive
index to baseline value. The rates of association and dissociation are calculated from these
curves, and the affinity or dissociation constant is calculated from these rates. Binding rates
and affinities for the high affinity composite site may be compared with the values obtained
for subsites recognized by each subdomain of the protein. As noted above, the difference in
these dissociation constants should be at least two orders of magnitude and preferably three
or greater.
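
For simple one-to-one binding, the dissociation constant follows from the two rate constants as K_d = k_off / k_on, and the comparison between composite site and subsites is a ratio of two such values. The Python sketch below illustrates that arithmetic; the function names and the rate constants are hypothetical and were chosen only so that the numbers are easy to follow.

```python
import math


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_d = k_off / k_on.

    k_on is the association rate constant (per molar per second) and
    k_off is the dissociation rate constant (per second); K_d is in molar.
    """
    if k_on <= 0 or k_off <= 0:
        raise ValueError("rate constants must be positive")
    return k_off / k_on


def orders_of_magnitude(kd_subsite, kd_composite):
    """log10 of the ratio between a subsite K_d and the composite-site K_d."""
    return math.log10(kd_subsite / kd_composite)


# Hypothetical rate constants, for illustration only.
kd_composite = dissociation_constant(k_on=1e6, k_off=1e-5)  # about 1e-11 M
kd_subsite = dissociation_constant(k_on=1e6, k_off=1e-2)    # about 1e-8 M
difference = orders_of_magnitude(kd_subsite, kd_composite)
print(f"{kd_composite:.1e} M, {kd_subsite:.1e} M, {difference:.1f} orders")
```

With these illustrative values the composite site binds about three orders of magnitude more tightly than the subsite, which would satisfy the preferred difference stated above.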

(e) Testing for function in vivo. 

Several tests of increasing stringency may be used to confirm the satisfactory
performance of a DNA-binding protein designed according to this invention. All share
essentially the same components: (1) (a) an expression plasmid directing the production of a
chimeric protein comprising the DNA-binding region and a transcriptional activation domain or
(b) one or more expression plasmids directing the production of a pair of chimeric proteins of
this invention which are capable of dimerizing in the presence of a corresponding dimerizing
agent, and thus forming a protein complex containing a DNA-binding region on one protein
and a transcription activation domain on the other; and (2) a reporter plasmid directing the
expression of a reporter gene, preferably identical in design to the target gene described
above (i.e., multiple binding sites for the DNA-binding domain, a minimal promoter element,
and a gene body) but encoding any conveniently measured protein.

In a transient transfection assay, the above-mentioned plasmids are introduced
together into tissue culture cells by any conventional transfection procedure, including for
example calcium phosphate co-precipitation, electroporation, and lipofection. After an
appropriate time period, usually 24-48 hr, the cells are harvested and assayed for
production of the reporter protein. In embodiments requiring dimerization of chimeric proteins
for activation of transcription, the assay is conducted in the presence of the dimerizing agent.
In an appropriately designed system, the reporter gene should exhibit little activity above
background in the absence of any co-transfected plasmid for the composite transcription
factor (or in the absence of dimerizing agent in embodiments under dimerizer control). In
contrast, reporter gene expression should be elevated in a dose-dependent fashion by the
inclusion of the plasmid encoding the composite transcription factor (or plasmids encoding the
multimerizable chimeras, following addition of multimerizing agent). This result indicates that
there are few natural transcription factors in the recipient cell with the potential to recognize
the tested binding site and activate transcription and that the engineered DNA-binding
domain is capable of binding to this site inside living cells.

The transient transfection assay is not an extremely stringent test in most cases,
because the high concentrations of plasmid DNA in the transfected cells lead to unusually
high concentrations of the DNA-binding protein and its recognition site, allowing functional
recognition even with relatively low affinity interactions. A more stringent test of the system is a
transfection that results in the integration of the introduced DNAs at near single-copy. Thus,
both the protein concentration and the ratio of specific to non-specific DNA sites would be
very low; only very high affinity interactions would be expected to be productive. This
scenario is most readily achieved by stable transfection in which the plasmids are
transfected together with another DNA encoding an unrelated selectable marker (e.g.,
G418-resistance). Transfected cell clones selected for drug resistance typically contain copy
numbers of the nonselected plasmids ranging from zero to a few dozen. A set of clones
covering that range of copy numbers can be used to obtain a reasonably clear estimate of
the efficiency of the system.

Perhaps the most stringent test involves the use of a viral vector, typically a
retrovirus, that incorporates both the reporter gene and the gene encoding the composite
transcription factor or multimerizable components thereof. Virus stocks derived from such a
construction will generally lead to single-copy transduction of the genes.

If the ultimate application is gene therapy, it may be preferred to construct transgenic animals
carrying similar DNAs to determine whether the protein is functional in an animal.

10. Design and assembly of the DNA constructs

Constructs may be designed in accordance with the principles, illustrative examples
and materials and methods disclosed in the patent documents and scientific literature cited
herein, each of which is incorporated herein by reference, with modifications and further
exemplification as described herein. Components of the constructs can be prepared in
conventional ways, where the coding sequences and regulatory regions may be isolated, as
appropriate, ligated, cloned in an appropriate cloning host, analyzed by restriction or
sequencing, or other convenient means. Particularly, using PCR, individual fragments
including all or portions of a functional unit may be isolated, where one or more mutations may
be introduced using "primer repair", ligation, in vitro mutagenesis, etc. as appropriate. In the
case of DNA constructs encoding fusion proteins, DNA sequences encoding individual
domains and sub-domains are joined such that they constitute a single open reading frame
encoding a fusion protein capable of being translated in cells or cell lysates into a single
polypeptide harboring all component domains. The DNA construct encoding the fusion
protein may then be placed into a vector that directs the expression of the protein in the
appropriate cell type(s). For biochemical analysis of the encoded chimera, it may be
desirable to construct plasmids that direct the expression of the protein in bacteria or in
reticulocyte-lysate systems. For use in the production of proteins in mammalian cells, the
protein-encoding sequence is introduced into an expression vector that directs expression in
these cells. Expression vectors suitable for such uses are well known in the art. Various
sorts of such vectors are commercially available.

11. Cells

This invention is particularly useful for the engineering of animal cells and in
applications involving the use of such engineered animal cells. The animal cells may be
insect, worm or mammalian or other animal cells. While various mammalian cells may be used,
including, by way of example, equine, bovine, ovine, canine, feline, murine, and non-human
primate cells, human cells are of particular interest. Among the various species, various types
of cells may be used, such as hematopoietic, neural, glial, mesenchymal, cutaneous,
mucosal, stromal, muscle (including smooth muscle cells), spleen, reticuloendothelial,
epithelial, endothelial, hepatic, kidney, gastrointestinal, pulmonary, fibroblast, and other cell
types. Of particular interest are hematopoietic cells, which may include any of the nucleated
cells which may be involved with the erythroid, lymphoid or myelomonocytic lineages, as
well as myoblasts and fibroblasts. Also of interest are stem and progenitor cells, such as
hematopoietic, neural, stromal, muscle, hepatic, pulmonary, gastrointestinal and mesenchymal
stem cells.

The cells may be autologous cells, syngeneic cells, allogeneic cells and even in some 
cases, xenogeneic cells with respect to an intended host organism. The cells may be 
modified by changing the major histocompatibility complex ("MHC") profile, by inactivating 
beta2-microglobulin to prevent the formation of functional Class I MHC molecules, 
inactivation of Class II molecules, providing for expression of one or more MHC molecules, 
enhancing or inactivating cytotoxic capabilities by enhancing or inhibiting the expression of 
genes associated with the cytotoxic activity, or the like. 

In some instances specific clones or oligoclonal cells may be of interest, where the cells have 
a particular specificity, such as T cells and B cells having a specific antigen specificity or 
homing target site specificity. 

Constructs encoding the chimeric transcription factors or other fusion proteins and constructs
comprising target genes can be introduced into the cells as one or more DNA molecules or
constructs, in many cases in association with one or more markers to allow for selection of
host cells which contain the construct(s). The construct(s) once completed and demonstrated
to have the appropriate sequences may then be introduced into a host cell by any
convenient means. The constructs may be incorporated into vectors capable of episomal
replication (e.g. BPV or EBV vectors) or into vectors designed for integration into the host
cells' chromosomes. The constructs may be integrated and packaged into non-replicating,
defective viral genomes like Adenovirus, Adeno-associated virus (AAV), or Herpes simplex
virus (HSV) or others, including retroviral vectors, for infection into cells. Viral delivery
systems are discussed in greater detail below. Alternatively, the construct may be introduced
by protoplast fusion, electroporation, biolistics, calcium phosphate transfection, lipofection,
microinjection of DNA or the like. The host cells will in some cases be grown and expanded
in culture before introduction of the construct(s), followed by the appropriate treatment for
introduction of the construct(s) and integration of the construct(s). The cells will then be
expanded and screened by virtue of a marker present in the constructs. Various markers
which may be used successfully include hprt, neomycin resistance, thymidine kinase,
hygromycin resistance, etc., and various cell-surface markers such as Tac, CD8, CD3, Thy1
and the NGF receptor.

In some instances, one may have a target site for homologous recombination, where
it is desired that a construct be integrated at a particular locus. For example, one can delete
and/or replace an endogenous gene (at the same locus or elsewhere) with a recombinant
target construct of this invention. For homologous recombination, one may generally use
either Ω or O-vectors. See, for example, Thomas and Capecchi, Cell (1987) 51, 503-512;
Mansour, et al., Nature (1988) 336, 348-352; and Joyner, et al., Nature (1989) 338, 153-156.

The constructs may be introduced as a single DNA molecule encoding all of the
genes, or different DNA molecules having one or more genes. The constructs may be
introduced simultaneously or consecutively, each with the same or different markers.
Vectors containing useful elements such as bacterial or yeast origins of replication, selectable
and/or amplifiable markers, promoter/enhancer elements for expression in prokaryotes or
eukaryotes, and mammalian expression control elements, etc. which may be used to
prepare stocks of construct DNAs and for carrying out transfections are well known in the art,
and many are commercially available.

12. Introduction of Constructs into Animals

Cells which have been modified ex vivo with the DNA constructs may be grown in
culture under selective conditions and cells which are selected as having the desired
construct(s) may then be expanded and further analyzed, using, for example, the
polymerase chain reaction for determining the presence of the construct in the host cells
and/or assays for the production of the desired gene product(s). Once modified host cells
have been identified, they may then be used as planned, e.g. grown in culture or introduced
into a host organism.

Depending upon the nature of the cells, the cells may be introduced into a host
organism, e.g. a mammal, in a wide variety of ways. Hematopoietic cells may be
administered by injection into the vascular system, there being usually at least about 10^4
cells and generally not more than about 10^10 cells. The number of cells which are employed
will depend upon a number of circumstances: the purpose for the introduction, the lifetime of
the cells, the protocol to be used, for example, the number of administrations, the ability of
the cells to multiply, the stability of the therapeutic agent, the physiologic need for the
therapeutic agent, and the like. Generally, for myoblasts or fibroblasts for example, the
number of cells will be at least about 10^4 and not more than about 10^9 and may be applied
as a dispersion, generally being injected at or near the site of interest. The cells will usually
be in a physiologically-acceptable medium.

Cells engineered in accordance with this invention may also be encapsulated, e.g.
using conventional biocompatible materials and methods, prior to implantation into the host
organism or patient for the production of a therapeutic protein. See e.g. Nguyen et al, Tissue
Implant Systems and Methods for Sustaining Viable High Cell Densities within a Host, US
Patent No. 5,314,471 (Baxter International, Inc.); Uludag and Sefton, 1993, J Biomed. Mater.
Res. 27(10):1213-24 (HepG2 cells/hydroxyethyl methacrylate-methyl methacrylate
membranes); Chang et al, 1993, Hum Gene Ther 4(4):433-40 (mouse Ltk- cells expressing
hGH/immunoprotective perm-selective alginate microcapsules); Reddy et al, 1993, J Infect
Dis 168(4):1082-3 (alginate); Tai and Sun, 1993, FASEB J 7(11):1061-9 (mouse fibroblasts
expressing hGH/alginate-poly-L-lysine-alginate membrane); Ao et al, 1995, Transplantation
Proc. 27(6):3349, 3350 (alginate); Rajotte et al, 1995, Transplantation Proc. 27(6):3389
(alginate); Lakey et al, 1995, Transplantation Proc. 27(6):3266 (alginate); Korbutt et al, 1995,
Transplantation Proc. 27(6):3212 (alginate); Dorian et al, US Patent No. 5,429,821 (alginate);
Emerich et al, 1993, Exp Neurol 122(1):37-47 (polymer-encapsulated PC12 cells); Sagen et
al, 1993, J Neurosci 13(6):2415-23 (bovine chromaffin cells encapsulated in semipermeable
polymer membrane and implanted into rat spinal subarachnoid space); Aebischer et al, 1994,
Exp Neurol 126(2):151-8 (polymer-encapsulated rat PC12 cells implanted into monkeys; see
also Aebischer, WO 92/19595); Savelkoul et al, 1994, J Immunol Methods 170(2):185-96
(encapsulated hybridomas producing antibodies; encapsulated transfected cell lines
expressing various cytokines); Winn et al, 1994, PNAS USA 91(6):2324-8 (engineered BHK
cells expressing human nerve growth factor encapsulated in an immunoisolation polymeric
device and transplanted into rats); Emerich et al, 1994, Prog Neuropsychopharmacol Biol
Psychiatry 18(5):935-46 (polymer-encapsulated PC12 cells implanted into rats); Kordower et
al, 1994, PNAS USA 91(23):10898-902 (polymer-encapsulated engineered BHK cells
expressing hNGF implanted into monkeys) and Butler et al, WO 95/04521 (encapsulated
device). The cells may then be introduced in encapsulated form into an animal host,
preferably a mammal and more preferably a human subject in need thereof. Preferably the
encapsulating material is semipermeable, permitting release into the host of secreted proteins
produced by the encapsulated cells. In many embodiments the semipermeable
encapsulation renders the encapsulated cells immunologically isolated from the host organism
in which the encapsulated cells are introduced. In those embodiments the cells to be
encapsulated may express one or more chimeric proteins containing component domains
derived from proteins of the host species and/or from viral proteins or proteins from species
other than the host species. For example in such cases the chimeras may contain elements
derived from GAL4 and VP16. The cells may be derived from one or more individuals other
than the recipient and may be derived from a species other than that of the recipient organism
or patient.

Instead of ex vivo modification of the cells, in many situations one may wish to
modify cells in vivo. For this purpose, various techniques have been developed for genetic
modification of target tissue and cells in vivo. A number of viral vectors have been
developed, such as adenovirus, adeno-associated virus, and retroviruses, which allow for
transduction and, in some cases, integration of the virus into the host. See, for example,
Dubensky et al. (1984) Proc. Natl. Acad. Sci. USA 81, 7529-7533; Kaneda et al. (1989)
Science 243, 375-378; Hiebert et al. (1989) Proc. Natl. Acad. Sci. USA 86, 3594-3598;
Hatzoglu et al. (1990) J. Biol. Chem. 265, 17285-17293 and Ferry et al. (1991) Proc. Natl.
Acad. Sci. USA 88, 8377-8381. The vector may be administered by injection, e.g.
intravascularly or intramuscularly, inhalation, or other parenteral mode. Non-viral delivery
methods such as administration of the DNA via complexes with liposomes or by injection,
catheter or biolistics may also be used. See e.g. WO 96/41865, PCT/US97/22454 and
USSN 60/084,819, for example, for additional guidance on formulation and delivery of
recombinant nucleic acids to cells and to organisms. Those references as well as the
references cited previously, including those relating to tetR-based systems, progesterone-r-based
systems and ecdysone-based systems, provide detailed additional guidance on the
preparation, formulation and delivery of various ligands to cells in vitro and to organisms. As
mentioned elsewhere, the contents of those cited documents are incorporated herein by
reference.

In accordance with in vivo genetic modification, the manner of the modification will
depend on the nature of the tissue, the efficiency of cellular modification required, the number
of opportunities to modify the particular cells, the accessibility of the tissue to the DNA
composition to be introduced, and the like. By employing an attenuated or modified retrovirus
carrying a target transcriptional initiation region, if desired, one can activate the virus using
one of the subject transcription factor constructs, so that the virus may be produced and
transfect adjacent cells.

The DNA introduction need not result in integration in every case. In some situations,
transient maintenance of the DNA introduced may be sufficient. In this way, one could have
a short term effect, where cells could be introduced into the host and then turned on after a
predetermined time, for example, after the cells have been able to home to a particular site.

13. Applications 

This invention is applicable to any situation that calls for expression of an
endogenous or exogenously-introduced gene, e.g. one that is embedded within a large genome.
The desired expression level could be preset very high or very low. The system may be
further engineered to achieve regulated or titratable expression. See e.g. PCT/US93/01617
and other previously cited references. In most cases, the inadvertent activation of unrelated
cellular genes is undesirable.

1. Regulated high-level gene expression in gene therapy. Gene therapy often
requires controlled high-level expression of a therapeutic gene, sometimes in a cell-type
specific pattern. By supplying the therapeutic gene with saturating amounts of an activating
transcription factor in accordance with this invention, considerably higher levels of gene
expression can be obtained relative to natural promoters or enhancers, which are dependent
on endogenous transcription factors. In addition, by supplying a ligand binding domain,
regulated expression of the therapeutic gene may be achieved. Thus, one application of this
invention to gene therapy is the delivery of a two-transcription-unit cassette (which may
reside on one or two plasmid molecules, depending on the delivery vector) consisting of (1) a
transcription unit encoding a transcription factor, whether naturally occurring or designed as
described above, for instance comprising a composite DNA-binding domain and a strong
transcription activation domain (e.g., derived from the VP16 protein or a human transcription
factor) linked to a ligand binding domain which, upon binding to ligand, regulates transcription
of the target gene and (2) a transcription unit consisting of the target gene linked to and under
the control of a minimal promoter carrying one, and preferably several, binding sites for the
composite DNA-binding domain. Cointroduction of the two transcription units into a cell
results in the production of the hybrid transcription factor which in turn activates the
therapeutic gene to high level. This strategy essentially incorporates an amplification step,
because the promoter that would be used to produce the therapeutic gene product in
conventional gene therapy is used instead to produce the activating transcription factor. Each
transcription factor has the potential to direct the production of multiple copies of the
therapeutic protein. In some cases, the application may involve a three-transcription-unit
cassette, where the DNA binding domain and activation domains are each linked to a ligand
binding domain. In such cases, binding of the ligand to each of the ligand binding domains
reconstitutes the transcription factor and allows transcription to proceed, as described in US
5,830,462, the contents of which are incorporated herein by reference.

This method may be employed to increase the efficacy of many gene therapy
strategies by substantially elevating the expression of a therapeutic target gene, allowing
expression to reach therapeutically effective levels. Examples of therapeutic genes that
would benefit from this strategy are genes that encode secreted therapeutic proteins, such
as cytokines (e.g., IL-2, IL-4, IL-12), CFTR (see e.g. Grubb et al, 1994, Nature 371:802-6),
growth factors (e.g., VEGF), antibodies, and soluble receptors. Other candidate therapeutic
genes are disclosed in PCT/US93/01617. This strategy may also be used to increase the
efficacy of "intracellular immunization" agents, molecules like ribozymes, antisense RNA, and
dominant-negative proteins, that act either stoichiometrically or by competition. Examples
include agents that block infection by or production of HIV or hepatitis virus and agents that
antagonize the production of oncogenic proteins in tumors.

It should be appreciated that in practice, the system is subject to many variables,
such as the efficiency of expression and, as appropriate, the level of secretion, the activity
of the expression product, the particular need of the patient, which may vary with time and
circumstances, the rate of loss of the cellular activity as a result of loss of cells or expression
activity of individual cells, and the like. Therefore, it is expected that for each individual
patient, even if there were universal cells which could be administered to the population at
large, each patient would be monitored for the proper dosage for the individual.

2. Production of recombinant proteins. Production of recombinant therapeutic proteins for
commercial and investigational purposes is often achieved through the use of mammalian cell
lines engineered to express the protein at high level. The use of mammalian cells, rather than
bacteria or yeast, is indicated where the proper function of the protein requires
post-translational modifications not generally performed by heterologous cells. Examples of
proteins produced commercially this way include erythropoietin, tissue plasminogen activator,
clotting factors such as Factor VIII:c, antibodies, etc. The cost of producing proteins in this
fashion is directly related to the level of expression achieved in the engineered cells. Thus,
because the constitutive two-transcription-unit system described above can achieve
considerably higher expression levels than conventional expression systems, it may greatly
reduce the cost of protein production.

3. Biological research. This invention is applicable to a wide range of biological
experiments in which precise control over a target gene is desired. These include: (1)
expression of a protein or RNA of interest for biochemical purification; and (2) tissue or organ
specific expression of a protein or RNA of interest in transgenic animals for the purposes of
evaluating its biological function. Transgenic animal models and other applications for which
this invention may be used include those disclosed in US Patent Application Serial Nos.
08/292,595 and 08/292,596 (filed August 18, 1994).

This invention further provides kits useful for practicing the described methods. Such kits
contain a first DNA sequence encoding a transcription factor and a second DNA sequence
containing a target gene linked to a DNA element to which the transcription factor is capable
of binding. Alternatively, the second DNA sequence may contain a cloning site for insertion of
a desired target gene by the practitioner. The kits optionally also contain a ligand useful for
regulated expression of the target gene.



The following examples contain important additional information, exemplification and
guidance which can be adapted to the practice of this invention in its various embodiments
and the equivalents thereof. The examples are offered by way of illustration and should not be
construed as limiting in any way. The contents of all cited references including literature
references, issued patents, published patent applications as cited throughout this application
are hereby expressly incorporated by reference. The practice of the present invention will
employ, unless otherwise indicated, conventional techniques of cell biology, cell culture,
molecular biology, transgenic biology, microbiology, recombinant DNA, and immunology,
which are within the skill of the art. Such techniques are explained fully in the literature. See,
for example, Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch
and Maniatis (Cold Spring Harbor Laboratory Press: 1989); DNA Cloning, Volumes I and II
(D. N. Glover ed., 1985); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Mullis et al. U.S.
Patent No: 4,683,195; Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984);
Transcription And Translation (B. D. Hames & S. J. Higgins eds. 1984); Culture Of Animal
Cells (R. I. Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells And Enzymes (IRL Press,
1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the treatise, Methods In
Enzymology (Academic Press, Inc., N.Y.); Gene Transfer Vectors For Mammalian Cells (J. H.
Miller and M. P. Calos eds., 1987, Cold Spring Harbor Laboratory); Methods In Enzymology,
Vols. 154 and 155 (Wu et al. eds.); Immunochemical Methods In Cell And Molecular Biology
(Mayer and Walker, eds., Academic Press, London, 1987); Handbook Of Experimental
Immunology, Volumes I-IV (D. M. Weir and C. C. Blackwell, eds., 1986); Manipulating the
Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986).

Examples

I. Individual DNA-binding and transcription activating components are
modular, may be incorporated into fusion proteins with various other domains and
function as intended in cell culture and in animals:

A. ZFHD1 and p65 work well individually in cell culture and in whole animals in drug- 
dependent (regulatable) transcription systems 

1. Constructs encoding chimeric transcription factors

(a) Unless otherwise stated, all DNA manipulations described in this and other examples
were performed using standard procedures (See e.g., F.M. Ausubel et al., Eds., Current
Protocols in Molecular Biology (John Wiley & Sons, New York, 1994)).

(b) Plasmids

Constructs encoding fusions of human FKBP12 (hereafter 'FKBP') with the yeast GAL4 DNA
binding domain, the HSV VP16 activation domain, human T cell CD3 zeta chain intracellular
domain or the intracellular domain of human FAS are disclosed in PCT/US94/01617.

Additional DNA vectors for directing the expression of fusion proteins relevant to this
invention were derived from the mammalian expression vector pCGNN (Attar, R.M. and
Gilman, M.Z. 1992. MCB 12: 2432-2443). Inserts cloned as XbaI-BamHI fragments into
pCGNN are transcribed under the control of the human CMV promoter and enhancer
sequences (nucleotides -522 to +72 relative to the cap site), and are expressed with an
optional epitope tag (a 16 amino acid portion of the H. influenzae hemagglutinin gene that is
recognized by the monoclonal antibody 12CA5) and, in the case of transcription factor
domains, with an N-terminal nuclear localization sequence (NLS; from SV40 T antigen).
Except where stated, all fragments cloned into pCGNN were inserted as XbaI-BamHI
fragments that included a SpeI site just upstream of the BamHI site. As XbaI and SpeI
produce compatible ends, this allowed further XbaI-BamHI fragments to be inserted
downstream of the initial insert and facilitated stepwise assembly of proteins comprising
multiple components. A stop codon was interposed between the SpeI and BamHI sites. For
initial constructs, the vector pCGNN-GAL4 was additionally used, in which codons 1-94 of
the GAL4 DNA binding domain gene were cloned into the XbaI site of pCGNN such that an
XbaI site is regenerated only at the 3' end of the fragment. Thus XbaI-BamHI fragments could
be cloned into this vector to generate GAL4 fusions, and subsequently recovered.
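The compatible-ends logic described above can be illustrated with a short Python sketch. It relies only on the standard recognition sequences and cut positions of XbaI (T^CTAGA), SpeI (A^CTAGT) and BamHI (G^GATCC); the function names are hypothetical and introduced purely for illustration. It shows that XbaI and SpeI leave the same 4-nucleotide overhang, and that joining a SpeI-cut upstream end to an XbaI-cut downstream end yields a junction that neither enzyme recognizes, which is why the upstream XbaI site and downstream SpeI site remain unique after each insertion.

```python
# Recognition sequence and top-strand cut offset for each enzyme
# (standard definitions: XbaI T^CTAGA, SpeI A^CTAGT, BamHI G^GATCC).
SITES = {
    "XbaI": ("TCTAGA", 1),
    "SpeI": ("ACTAGT", 1),
    "BamHI": ("GGATCC", 1),
}


def overhang(enzyme):
    """Return the 5' overhang left by a palindromic 6-bp cutter."""
    site, cut = SITES[enzyme]
    return site[cut:len(site) - cut]


def junction(upstream_enzyme, downstream_enzyme):
    """Sequence formed when the upstream half of one cut site is ligated
    to the downstream half of another; raises if the ends do not match."""
    if overhang(upstream_enzyme) != overhang(downstream_enzyme):
        raise ValueError("incompatible ends")
    up_site, up_cut = SITES[upstream_enzyme]
    down_site, down_cut = SITES[downstream_enzyme]
    return up_site[:up_cut] + down_site[down_cut:]


# Illustrative use: vector cut with SpeI, incoming fragment cut with XbaI.
scar = junction("SpeI", "XbaI")
print("XbaI overhang:", overhang("XbaI"))
print("SpeI overhang:", overhang("SpeI"))
print("SpeI/XbaI junction:", scar)
print("junction recut by any listed enzyme:",
      any(scar == site for site, _ in SITES.values()))
```

The sketch models only the top strand and ignores flanking sequence, so it says nothing about reading frame; it is meant solely to make the end-compatibility argument concrete.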
(c) Constructs encoding GAL4 DNA binding domain-FRAP fusions

To obtain portions of the human FRAP gene, human thymus total RNA (Clontech #64028-1)
was reverse transcribed using MMLV reverse transcriptase and random hexamer primer
(Clontech 1st strand synthesis kit). This cDNA was used directly in a PCR reaction
containing primers 1 and 2 and Pfu polymerase (Stratagene). The primers were designed to
amplify the coding sequence for amino acids 2025-2113 inclusive of human FRAP: an 89
amino acid region essentially corresponding to the minimal 'FRB' domain identified by Chen et
al. (Proc. Natl. Acad. Sci. USA (1995) 92, 4947-4951) as necessary and sufficient for
FKBP-rapamycin binding (hereafter named FRB). The appropriately-sized band was purified,
digested with XbaI and SpeI, and ligated into XbaI-SpeI digested pCGNN-GAL4. This
construct was confirmed by restriction analysis (to verify the correct orientation) and DNA
sequencing and designated pCGNN-GAL4-1FRB.

Constructs encoding FRB multimers were obtained by isolating the FRB XbaI-BamHI
fragment, and then ligating it back into pCGNN-GAL4-1FRB digested with SpeI and BamHI
to generate pCGNN-GAL4-2FRB, which was confirmed by restriction analysis. This
procedure was repeated analogously on the new construct to yield pCGNN-GAL4-3FRB
and pCGNN-GAL4-4FRB.
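The iterative multimer assembly just described can be mimicked at the string level. The following Python sketch is an illustration under stated assumptions, not the procedure itself: `UNIT` is a hypothetical placeholder standing in for one FRB coding unit (it is not the real FRB sequence), the model tracks only the insert region from the XbaI site to the BamHI site, and it assumes a single-copy fragment is added in each round. It checks that each product keeps exactly one XbaI, one SpeI and one BamHI site, so the same digestion and ligation cycle can be repeated.

```python
XBAI, SPEI, BAMHI = "TCTAGA", "ACTAGT", "GGATCC"
STOP = "TGA"
UNIT = "GCCGCAGCC"  # hypothetical placeholder, NOT the real FRB sequence


def cassette(cargo):
    """XbaI-cargo-SpeI-stop-BamHI layout described for pCGNN inserts."""
    return XBAI + cargo + SPEI + STOP + BAMHI


def add_copy(insert, fragment):
    """Ligate an XbaI/BamHI-cut fragment into a SpeI/BamHI-cut insert."""
    # SpeI cuts A^CTAGT: keep everything through the 'A' of the SpeI site.
    kept = insert[: insert.index(SPEI) + 1]
    # XbaI cuts T^CTAGA: the incoming fragment begins at 'CTAGA'.
    incoming = fragment[fragment.index(XBAI) + 1:]
    return kept + incoming


def build(copies):
    """Return the modeled insert carrying the requested number of units."""
    fragment = cassette(UNIT)
    insert = fragment
    for _ in range(copies - 1):
        insert = add_copy(insert, fragment)
    return insert


for n in range(1, 5):
    product = build(n)
    print(f"{n} unit(s): XbaI={product.count(XBAI)} "
          f"SpeI={product.count(SPEI)} BamHI={product.count(BAMHI)} "
          f"units={product.count(UNIT)}")
```

Each round leaves a hybrid ACTAGA junction between units, which neither XbaI nor SpeI cuts; that is the property that lets the flanking sites stay unique as copies accumulate.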

Vectors were also constructed that encode larger fragments of FRAP, encompassing the
minimal FRB domain (amino acids 2025-2113) but extending beyond it. PCR primers were
designed that amplify various regions of FRAP flanked by 5' XbaI and 3' SpeI sites as
indicated below.



Designation    amino acids    5' primer    3' primer

FRAPa          2012-2127      6            7
FRAPb          1995-2141      5            8
FRAPc          1945-2113      3            2
FRAPd          1995-2113      5            2
FRAPe          2012-2113      6            2
FRAPf          2025-2127      1            7
FRAPg          2025-2141      1            8
FRAPh          2025-2174      1            4
FRAPj          1945-2174      3            4
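As a quick arithmetic check on the fragments tabulated above, the following Python sketch computes the length of each fragment and how far it extends beyond the minimal FRB domain (amino acids 2025-2113) on either side. The coordinates are taken from the table; the dictionary and function names are hypothetical. Inclusive residue ranges are assumed, consistent with the statement that residues 2025-2113 span 89 amino acids.

```python
# Minimal FRB domain coordinates (inclusive), as given in the text.
FRB_MIN = (2025, 2113)

# Fragment coordinates copied from the table (inclusive residue ranges).
FRAGMENTS = {
    "FRAPa": (2012, 2127),
    "FRAPb": (1995, 2141),
    "FRAPc": (1945, 2113),
    "FRAPd": (1995, 2113),
    "FRAPe": (2012, 2113),
    "FRAPf": (2025, 2127),
    "FRAPg": (2025, 2141),
    "FRAPh": (2025, 2174),
    "FRAPj": (1945, 2174),
}


def span_length(first, last):
    """Number of residues in an inclusive range."""
    return last - first + 1


def extensions(first, last):
    """Residues added N-terminal and C-terminal to the minimal FRB."""
    return FRB_MIN[0] - first, last - FRB_MIN[1]


print("minimal FRB:", span_length(*FRB_MIN), "residues")
for name, (first, last) in FRAGMENTS.items():
    n_ext, c_ext = extensions(first, last)
    print(f"{name}: {span_length(first, last)} residues "
          f"(+{n_ext} N-terminal, +{c_ext} C-terminal)")
```

Fragments sharing a 5' primer have the same N-terminal extension and fragments sharing a 3' primer have the same C-terminal extension, which matches the primer assignments in the table.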



Initially, fragment FRAPj was amplified by RT-PCR as described above, digested with XbaI
and SpeI, and ligated into XbaI-SpeI digested pCGNN-GAL4. This construct,
pCGNN-GAL4-FRAPj, was analyzed by PCR to confirm insert orientation and verified by DNA
sequencing. It was then used as a PCR substrate to amplify the other fragments using the
primers listed. The new fragments were cloned as GAL4 fusions as described above to yield
the constructs pCGNN-GAL4-FRAPa, pCGNN-GAL4-FRAPb etc, which were confirmed by
DNA sequencing.

Vectors encoding concatenates of two of the larger FRAP fragments, FRAPd and FRAPe,
were generated by analogous methods to those used earlier. XbaI-BamHI fragments
encoding FRAPd and FRAPe were isolated from pCGNN-GAL4-FRAPd and
pCGNN-GAL4-FRAPe and ligated back into the same vectors digested with SpeI and BamHI
to generate pCGNN-GAL4-2FRAPd and pCGNN-GAL4-2FRAPe. This procedure was repeated
analogously on the new constructs to yield pCGNN-GAL4-3FRAPd, pCGNN-GAL4-3FRAPe,
pCGNN-GAL4-4FRAPd and pCGNN-GAL4-4FRAPe. All constructs were verified
by restriction analysis.

(d) Constructs encoding FRAP-VP16 activation domain fusions

To generate N-terminal fusions of FRB domain(s) with the activation domain of the Herpes
Simplex Virus protein VP16, the XbaI-BamHI fragments encoding 1, 2, 3 and 4 copies of FRB
were recovered from the GAL4 fusion vectors and ligated into XbaI-BamHI digested pCGNN
to yield pCGNN-1FRB, pCGNN-2FRB etc. These vectors were then digested with SpeI
and BamHI. An XbaI-BamHI fragment encoding amino acids 414-490 of VP16 was isolated
from plasmid pCG-Gal4-VP16 (Das, G., Hinkley, C.S. and Herr, W. (1995) Nature 374,
657-660) and ligated into the SpeI-BamHI digested vectors to generate pCGNN-1FRB-VP16,
pCGNN-2FRB-VP16, etc. The constructs were verified by restriction analysis and/or DNA
sequencing.

(e) Constructs encoding ZFHD1 DNA binding domain-FRAP fusions

An expression vector for directing the expression of ZFHD1 coding sequence in mammalian
cells was prepared as follows. Zif268 sequences were amplified from a cDNA clone by PCR
using primers 5'Xba/Zif and 3'Zif+G. Oct1 homeodomain sequences were amplified from a
cDNA clone by PCR using primers 5'Not Oct HD and Spe/Bam 3'Oct. The Zif268 PCR
fragment was cut with XbaI and NotI. The Oct1 PCR fragment was cut with NotI and BamHI.
Both fragments were ligated in a 3-way ligation between the XbaI and BamHI sites of
pCGNN (Attar and Gilman, 1992) to make pCGNNZFHD1, in which the cDNA insert is under
the transcriptional control of human CMV promoter and enhancer sequences and is linked to
the nuclear localization sequence from SV40 T antigen. The plasmid pCGNN also contains a
gene for ampicillin resistance which can serve as a selectable marker. (Derivatives,
pCGNNZFHD1-FKBPx1 and pCGNNZFHD1-FKBPx3, were prepared containing one or
three tandem repeats of human FKBP12 ligated as an XbaI-BamHI fragment between the
SpeI and BamHI sites of pCGNNZFHD1. A sample of pCGNNZFHD1-FKBPx3 has been
deposited with the American Type Culture Collection under ATCC Accession No. 97399.)

Primers:

5'Xba/Zif        5'ATGCTCTAGAGAACGGCCATATGCTTGCCCT
3'Zif+G          5'ATGCGCGGCCGCCGCCTGTGTGGGTGCGGATGTG
5'Not Oct HD     5'ATGCGCGGCCGCAGGAGGAAGAAACGCACCAGC
Spe/Bam 3'Oct    5'GCATGGATCCGATTCAACTAGTGTTGATTCTTTTTTCTTTCTGGCGGCG
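The cloning sites carried by the four primers listed above can be checked with a short Python sketch. It scans each primer for the standard recognition sequences of XbaI, NotI, SpeI and BamHI, then reverse-complements the Spe/Bam 3'Oct primer to look at the coding-strand sequence between its SpeI and BamHI sites. Two assumptions are made and should be treated as such: the run of six ambiguous characters in the Spe/Bam 3'Oct primer is read here as six T residues, and the helper names are hypothetical.

```python
# Standard recognition sequences; all four sites are palindromic, so
# scanning the primer strand alone is sufficient.
SITES = {
    "XbaI": "TCTAGA",
    "NotI": "GCGGCCGC",
    "SpeI": "ACTAGT",
    "BamHI": "GGATCC",
}

PRIMERS = {
    "5'Xba/Zif": "ATGCTCTAGAGAACGGCCATATGCTTGCCCT",
    "3'Zif+G": "ATGCGCGGCCGCCGCCTGTGTGGGTGCGGATGTG",
    "5'Not Oct HD": "ATGCGCGGCCGCAGGAGGAAGAAACGCACCAGC",
    # Assumption: the six ambiguous characters are read as TTTTTT.
    "Spe/Bam 3'Oct": "GCATGGATCCGATTCAACTAGTGTTGATTCTTTTTTCTTTCTGGCGGCG",
}


def sites_in(seq):
    """Names of the listed sites found in seq, in 5'-to-3' order."""
    hits = [(seq.find(site), name) for name, site in SITES.items()
            if site in seq]
    return [name for _, name in sorted(hits)]


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def between_spe_and_bam(primer):
    """Coding-strand bases lying between the SpeI and BamHI sites."""
    coding = reverse_complement(primer)
    start = coding.index(SITES["SpeI"]) + len(SITES["SpeI"])
    end = coding.index(SITES["BamHI"])
    return coding[start:end]


for name, seq in PRIMERS.items():
    print(f"{name}: {sites_in(seq)}")
print("between SpeI and BamHI:",
      between_spe_and_bam(PRIMERS["Spe/Bam 3'Oct"]))
```

On the coding strand, the bases immediately following the SpeI site begin with TGA, in line with the statement that a stop codon was interposed between the SpeI and BamHI sites.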

To generate C-terminal fusions of FRB domain(s) with the chimeric DNA binding protein
ZFHD1, the XbaI-BamHI fragments encoding 1, 2, 3 and 4 copies of FRB were recovered
from the GAL4 fusion vectors and ligated into SpeI-BamHI digested pCGNN-ZFHD1 to yield
pCGNN-ZFHD1-1FRB, pCGNN-ZFHD1-2FRB etc. Constructs were verified by restriction
analysis and/or DNA sequencing.

To examine the effect of introducing additional 'linker' polypeptide between ZFHD1 and a
C-terminal FRB domain, FRAP fragments encoding extra sequence N-terminal to FRB were
cloned as ZFHD1 fusions. XbaI-BamHI fragments encoding FRAPa, FRAPb, FRAPc, FRAPd
and FRAPe were excised from the vectors pCGNN-GAL4-FRAPa, pCGNN-GAL4-FRAPb
etc and ligated into SpeI-BamHI digested pCGNN-ZFHD1 to yield the vectors
pCGNN-ZFHD1-FRAPa, pCGNN-ZFHD1-FRAPb, etc. Vectors encoding fusions of ZFHD1 to 2, 3
and 4 C-terminal copies of FRAPe were also constructed by isolating XbaI-BamHI fragments
encoding 2FRAPe, 3FRAPe and 4FRAPe from pCGNN-GAL4-2FRAPe, pCGNN-GAL4-3FRAPe
and pCGNN-GAL4-4FRAPe and ligating them into SpeI-BamHI digested
pCGNN-ZFHD1 to yield the vectors pCGNN-ZFHD1-2FRAPe, pCGNN-ZFHD1-3FRAPe and
pCGNN-ZFHD1-4FRAPe. All constructs were verified by restriction analysis.

Vectors were also constructed that encode N-terminal fusions of FRB domain(s) with ZFHD1.
XbaI-BamHI fragments encoding 1, 2, 3 and 4 copies of FRAPe were isolated from
pCGNN-GAL4-1FRAPe, pCGNN-GAL4-2FRAPe etc and ligated into XbaI-BamHI digested pCGNN
to yield the plasmids pCGNN-1FRAPe, pCGNN-2FRAPe etc. These vectors were then
digested with SpeI and BamHI, and an XbaI-BamHI fragment encoding ZFHD1 (isolated from
pCGNN-ZFHD1) ligated in to yield the constructs pCGNN-1FRAPe-ZFHD1,
pCGNN-2FRAPe-ZFHD1 etc, which were verified by restriction analysis.

(f) Constructs encoding FRAP-p65 activation domain fusions

To generate fusions of FRB domain(s) with the activation domain of the human NF-κB p65
subunit (hereafter designated p65), two fragments were amplified by PCR from the plasmid
pCG-p65. Primers 9 (p65 5' Xba) and 11 (p65 3' Spe/Bam) amplify the coding sequence for
amino acids 450-550, and primers 10 (p65/361 Xba) and 11 amplify the coding sequence for
amino acids 361-550, both flanked by 5' XbaI and 3' SpeI/BamHI sites. PCR products were
digested with XbaI and BamHI and cloned into XbaI-BamHI digested pCGNN to yield
pCGNN-p65(450-550) and pCGNN-p65(361-550). The constructs were verified by
restriction analysis and DNA sequencing.

The 100 amino acid p65 transcription activation sequence is encoded by the following linear
sequence:

CTGGGGGCCTTGCTTGGCAACAGCACAGACCCAGCTGTGTTCACAGACCTGGCATCCGTCGACAA 
CTGGGAGTTTGAGCAGCTGCTGAACGAGGGCATACCTGTGGGCCCGCAGACAACTGAGCCCATGG
TGATGGAGTACCCTGAGGCTATAACTCGGCTAGTGACAGGGGCCCAGAGGCCCCCCGACCCAGCT 
GCTGCTCCACTGGGGGGCGCGGGGGTCCCGAATGGCCTCCTTTCAGGAGATGAAGACTTCTCCTC 
CATTGCGGACATGGACTTCTCAGCCCTGCTGAGTCAGATCAGCTCC 

The more extended p65 transcription activation domain (361-550) is encoded by the
following linear sequence:

GATGAGTTTCCCACCATGGTGTTTCCTTCTGGGCAGATCAGCGAGGCGTCGGCCTTGGCCCCGGCC 
CCTCCCGAAGTCCTGCCCGAGGCTCGAGCGCCTGCCGCTGCTCCAGCCATGGTATCAGCTCTGGC 

CCAGGGGCCAGCCGCTGTCCCAGTCCTAGCCCCAGGCCCTCCTCAGGGTGTGGCGCCACCTGCCC
CCAAGCCCACCCAGGCTGGGGAAGGAAGGGTGTCAGAGGCCCTGCTGCAGCTGCAGTTTGATGAT 
GAAGACCTGGGGGCCTTGCTTGGGAAGAGGACAGACGGAGCTGTGTTGAGAGACGTGGCATCCGT 
CGACAACTCCGAGTTTCAGCAGCTGCTGAACCAGGGCATACGTGTGGCCCCCCACACAACTGAGC 
CCATGCTGATGGAGTACCCTGAGGCTATAACTCGCCTAGTGAGAGCGCAGAGGCGCCGCGACCCA 

GCTCCTGCTCCAGTGGGGGCGGCGGGGGTCCGGAATGGGGTGGTTTGAGGAGATGAAGAGTTGTG
CTCCATTGCGGACATGGACTTCTCAGCCCTGCTGAGTCAGATCAGCTCCTAA 

To generate N-terminal fusions of FRB domain(s) with portions of the p65 activation domain,
plasmids pCGNN-1FRB, pCGNN-2FRB etc were digested with SpeI and BamHI. An
XbaI-BamHI fragment encoding p65(450-550) was isolated from pCGNN-p65(450-550) and
ligated into the SpeI-BamHI digested vectors to yield the plasmids
pCGNN-1FRB-p65(450-550), pCGNN-2FRB-p65(450-550) etc. The construct pCGNN-1FRB-p65(361-550) was
made similarly using an XbaI-BamHI fragment isolated from pCGNN-p65(361-550). These
constructs were verified by restriction analysis.

To examine the effect of introducing additional 'linker' polypeptide between the p65 activation
domain and an N-terminal FRB domain, FRAP fragments encoding extra sequence C-terminal
to FRB were cloned as p65 fusions. XbaI-BamHI fragments encoding FRAPa, FRAPb,
FRAPf, FRAPg and FRAPh were excised from the vectors pCGNN-GAL4-FRAPa,
pCGNN-GAL4-FRAPb etc and ligated into XbaI-BamHI digested pCGNN to yield the vectors
pCGNN-FRAPa, pCGNN-FRAPb, etc. These plasmids were then digested with SpeI and
BamHI and an XbaI-BamHI fragment encoding p65 (amino acids 450-550) ligated in to yield
the five vectors pCGNN-FRAPa-p65, pCGNN-FRAPb-p65, etc, which were verified by
restriction analysis.

Vectors encoding fusions of p65 to 1 and 3 N-terminal copies of FRAPe were also prepared
by digesting pCGNN-1FRAPe and pCGNN-3FRAPe with SpeI and BamHI. XbaI-BamHI
fragments encoding p65(450-550) and p65(361-550) (isolated from pCGNN-p65(450-550)
and pCGNN-p65(361-550)) were then ligated in to yield the vectors
pCGNN-1FRAPe-p65(450-550), pCGNN-3FRAPe-p65(450-550), pCGNN-1FRAPe-p65(361-550) and
pCGNN-3FRAPe-p65(361-550). All constructs were verified by restriction analysis.

Vectors were also constructed that encode C-terminal fusions of FRB domain(s) with portions
of the p65 activation domain. Plasmids pCGNN-p65(450-550) and pCGNN-p65(361-550)
were digested with SpeI and BamHI, and XbaI-BamHI fragments encoding 1 and 3 copies of
FRAPe (isolated from pCGNN-GAL4-1FRAPe and pCGNN-GAL4-3FRAPe) and 1 copy of
FRB (isolated from pCGNN-GAL4-1FRB) ligated in to yield the plasmids
pCGNN-p65(450-550)-1FRAPe, pCGNN-p65(450-550)-3FRAPe, pCGNN-p65(361-550)-1FRAPe,
pCGNN-p65(361-550)-3FRAPe, pCGNN-p65(450-550)-1FRB and pCGNN-p65(361-550)-1FRB. All
constructs were verified by restriction analysis.
(g) Further constructs

Other constructs can be made analogously with the above procedures, but using alternative
portions of the FRAP sequence. For example, primers 12 and 13 are used to amplify the
entire coding region of FRAP. Primers 1 and 13, 6 and 13, and 5 and 13 are used to amplify
three fragments encompassing the FRB domain and extending through to the C-terminal end
of the protein (including the lipid kinase homology domain). These fragments differ by
encoding different portions of the protein N-terminal to the FRB domain. In each case
RT-PCR is used as described above to amplify the regions from human thymus RNA, the PCR
products are purified, digested with XbaI and SpeI, ligated into XbaI-SpeI digested pCGNN,
and verified by restriction analysis and DNA sequencing.



Primer sequences

 1   5' [illegible]ATGTCTAGAGAGATGTGGCATGAAGGCCTGGAAG
 2   5' GCATCACTAGTCTTTGAGATTCGTCGGAACACATG
 3   5' GCACATTCTAGAATTGATACGCCCAGACCCTTG
 4   5' CGATCAACTAGTAAGTGTCAATTTCCGGGGCCT
 5   5' GCACTATCTAGACTGAAGAACATGTGTGAGCACAGC
 6   5' GCACTATCTAGAGTGAGCGAGGAGCTGATCCGAGTG
 7   5' CGATCAACTAGTGGAAACATATTGCAGCTCTAAGGA
 8   5' CGATCAACTAGTTGGCACAGCCAATTCAAGGTCCCG
 9   5' ATGCTCTAGACTGGGGGCCTTGCTTGGCAAC
10   5' ATGCTCTAGAGATGAGTTTCCCACCATGGTG
11   5' GCATGGATCCGCTCAACTAGTGGAGCTGATCTGACTCAG
12   5' ATGCTCTAGACTTGGAACCGGACCTGCCGCC
13   5' GCATCACTAGTCCAGAAAGGGCACCAGCCAATAT

Restriction sites are underlined (XbaI = TCTAGA, SpeI = ACTAGT, BamHI = GGATCC).
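As a quick consistency check on a primer list of this kind, the following minimal sketch scans a primer for the three recognition sequences used in the cloning scheme. It is illustrative only: the function name `find_sites` is hypothetical, the SpeI recognition sequence is taken as ACTAGT, and the two example sequences are primers 9 and 11 with their restriction sites written out in full.

```python
# Recognition sequences for the three enzymes used in the cloning scheme.
SITES = {"XbaI": "TCTAGA", "SpeI": "ACTAGT", "BamHI": "GGATCC"}


def find_sites(primer):
    """Return {enzyme: [0-based start positions]} for each site found in the primer."""
    primer = primer.upper()
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        start = primer.find(site)
        while start != -1:
            positions.append(start)
            start = primer.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


# Primers 9 and 11 (p65 5' Xba and p65 3' Spe/Bam).
PRIMER_9 = "ATGCTCTAGACTGGGGGCCTTGCTTGGCAAC"
PRIMER_11 = "GCATGGATCCGCTCAACTAGTGGAGCTGATCTGACTCAG"

if __name__ == "__main__":
    print("primer 9:", find_sites(PRIMER_9))
    print("primer 11:", find_sites(PRIMER_11))
```

The forward primer should report only an XbaI site and the reverse primer a BamHI site followed by a SpeI site, which is the arrangement that lets each amplified fragment be excised with XbaI and BamHI while leaving a SpeI site for the next in-frame insertion.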
DNA sequence of representative final construct: pCGNN-ZFHD1-1FRB

                     12CA5 epitope
                 M   A   S   S   Y   P   Y   D   V   P   D
5' gtagaagcgcgt ATG GCT TCT AGC TAT CCT TAT GAC GTG CCT GAC

                                SV40 T NLS
 Y   A   S   L   G   G   P   S   S   P   K   K   K   R   K
TAT GCC AGC CTG GGA GGA CCT TCT AGT CCT AAG AAG AAG AGA AAG
                            (X/S)

ZFHD1 (5')
GTC TCT AGA GAA CGC CCA TAT GCT TGC CCT GTC GAG TCC TGC GA...
    XbaI

ZFHD1 (3')      FRB (5')
 R   I   N   T   R   E   M   W   H   E   G   L   E   E
... ... ... ACT AGA GAG ATG TGG CAT GAA GGC CTG GAA GA...
            (S/X)

FRB (3')
 R   I   S   K   T   S   Y   *
CGA ATC TCA AAG ACT AGT TAT TAG ggatcctgag
                SpeI            BamHI

Non-coding nucleotides are indicated in lower case.
(S/X) and (X/S) indicate the result of a ligation event between the compatible products of
digestion with XbaI and SpeI, to produce a sequence that is cleavable by neither enzyme.
* indicates a stop codon.
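The (S/X) and (X/S) junctions can be worked through explicitly. The sketch below is a minimal illustration, assuming the standard cut positions T^CTAGA for XbaI and A^CTAGT for SpeI (both leave a 5'-CTAG overhang); the function names `junction` and `cleavable_by` are hypothetical.

```python
XBAI = "TCTAGA"  # cut as T^CTAGA, leaving a 5'-CTAG overhang
SPEI = "ACTAGT"  # cut as A^CTAGT, leaving the same 5'-CTAG overhang


def junction(upstream_site, downstream_site):
    """Sequence formed when the upstream half of one cut site is ligated
    to the downstream half of the other via the shared CTAG overhang."""
    return upstream_site[0] + downstream_site[1:]


def cleavable_by(seq):
    """Names of the two enzymes whose full recognition site occurs in seq."""
    return [name for name, site in (("XbaI", XBAI), ("SpeI", SPEI)) if site in seq]


s_x = junction(SPEI, XBAI)  # SpeI-cut vector end joined to XbaI-cut insert end
x_s = junction(XBAI, SPEI)  # XbaI-cut end joined to SpeI-cut end

if __name__ == "__main__":
    print("(S/X):", s_x, cleavable_by(s_x))
    print("(X/S):", x_s, cleavable_by(x_s))
```

Both hybrid junctions (ACTAGA and TCTAGT) lack an intact XbaI or SpeI site, so after each insertion only the SpeI site carried at the 3' end of the new insert remains available for the next round of cloning.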
oligonucleotides

5'-GAATTCCTAGAAGCGACCATGGCTTCTAGC-3'
and
5'-GAAGAGAAAGGTGGCTAGCGAACGCCGATAT-3'

downstream of pBS-IRES to create pBS-IRES-ZFHD1.

[illegible] from this plasmid was next cloned into SpeI/BamHI-cut pCGNN-1FRB-p65(361-550) to
create pCGNN-1FRB-p65(361-550)-IRES-ZFHD1-3FKBP.

PCR primers for the HSF trimerization domain:

5' primer: atgctctaaaagtgtgtccaccctgaagagtgaagac
3' primer: atgctgatcaagatctttattaactagtgccactgtcgttcagcatcagggggm
Template: pBSI 08 vector containing human HSF1 full length cDNA.
The PCR fragments containing the trimerization domain of HSF1 (amino acids [illegible]

2. Retroviral vectors for the expression of chimeric proteins

Retroviral vectors used to express transcription factor fusion proteins [illegible]
genes were derived from pSRaMSVtkNeo ([illegible])
[illegible] to produce pSMTN2 and pSMTX2, respectively. pSMTN2 expresses the Neo
gene from an internal thymidine kinase promoter. A Zeocin gene (Invitrogen) will be cloned
as a NheI fragment into a unique XbaI site downstream of an internal thymidine kinase
promoter in pSMTX2 to yield pSMTZ. This Zeocin fragment was generated by
mutagenizing pZeo/SV (Invitrogen) using the following primers to introduce NheI sites
flanking the zeocin coding sequence.

Primer 1  5'-GCCATGGTGGCTAGCCTATAGTGAG
Primer 2  5'-GGCGGTGTTGGCTAGCGTCGGTCAG

pSMTN2 contains unique EcoRI and HindIII sites downstream of the LTR. To facilitate
cloning of transcription factor fusion proteins synthesized as XbaI-BamHI fragments, the
following sequence was inserted between the EcoRI and HindIII sites to create pSMTN3.



5' [illegible]cagaagcgcgt ATG GCT TCT AGC TAT CCT TAT GAC GTG CCT GAC
   EcoRI

                                SV40 T NLS
                             S   S   P   K   K   K   R   K
TAT GCC AGC CTG GGA GGA CCT TCT AGT CCT AAG AAG AAG AGA AAG

 V
GTC TCT AGA TAT CGA GGA TCC CAA GCT T
                    BamHI   HindIII



The equivalent fragment is inserted into a unique EcoRI site of pSMTZ [illegible],
with the only difference being that the 3' HindIII site is replaced by an EcoRI site.
pSMTN3 and pSMTZ3 permit chimeric transcription factors to be cloned downstream of the 5'
LTR as XbaI-BamHI fragments and allow selection for stable integrants by virtue of their
ability to confer resistance to the antibiotics G418 or Zeocin respectively.
To generate the retroviral vector SMTN-ZFHD1-3FKBP, pCGNN-ZFHD1-3FKBP was first
mutated to add an EcoRI site upstream of the first amino acid of the fusion protein. An
EcoRI-[illegible](blunted) fragment was then cloned into EcoRI-HindIII(blunted) pSRaMSVtkNeo
(ref. 51) so that ZFHD1-3FKBP was expressed from the retroviral LTR.
3. Rapamycin-dependent transcriptional activation

Our previous experiments showed that three copies of FKBP fused either to a Gal4 DNA
binding domain or a transcription activation domain activated both stably integrated and
transiently transfected reporter genes more strongly than corresponding fusion proteins
containing only one or two FKBP domains. To evaluate this parameter with FRB fusion
proteins, effector plasmids containing Gal4 DNA binding domain fused to one or more copies
of an FRB domain were co-transfected with a plasmid encoding three FKBP domains and a
p65 activation domain (3xFKBP-p65) by transient transfection. The results indicate that in
this system, four copies of the FRB domain fused to the Gal4 DNA binding domain activated
the stably integrated reporter gene more strongly than other corresponding fusion proteins
with fewer FRB domains.

Method: HT1080 B cells were grown in MEM supplemented with 10% Bovine Calf Serum.
Approximately 4x10^5 cells/well in a 6 well plate (Falcon) were transiently transfected by the
Lipofection procedure as recommended by the supplier (GIBCO, BRL). The
DNA:Lipofectamine ratio used in this experiment corresponded to 1:6. Cells in each well received 500
ng of pCGNN F3-p65, 1.9 µg of pUC118 plasmid as carrier and 100 ng of one of the
following plasmids: pCGNN Gal4-1FRB, pCGNN Gal4-2FRB, pCGNN Gal4-3FRB or
pCGNN Gal4-4FRB. Following transfection, 2 ml fresh media was added and supplemented
with rapamycin to the indicated concentration. After 24 hrs, 100 µl of the media was
assayed for SEAP activity as described (Spencer et al., 1993).
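The per-well amounts in this method can be tallied in a few lines. This is a rough sketch only: the text gives the DNA:Lipofectamine ratio as 1:6 without stating units, so the weight-to-weight interpretation used here is an assumption, and the variable names are hypothetical.

```python
# Per-well DNA amounts from the method, in nanograms.
DNA_NG = {
    "pCGNN F3-p65": 500,
    "pUC118 carrier": 1900,
    "pCGNN Gal4-nFRB effector": 100,
}

# Stated DNA:Lipofectamine ratio of 1:6; treating it as w/w is an assumption.
LIPID_PER_DNA = 6

total_dna_ng = sum(DNA_NG.values())
lipid_ng = total_dna_ng * LIPID_PER_DNA

if __name__ == "__main__":
    print(f"total DNA per well: {total_dna_ng / 1000:.1f} ug")
    print(f"Lipofectamine per well (assuming w/w): {lipid_ng / 1000:.1f} ug")
```

Holding total DNA constant with carrier plasmid means that only the identity of the effector plasmid varies between wells, so differences in SEAP output can be attributed to the number of FRB domains rather than to the DNA load.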

To test whether fusing multiple FRB domains to a p65 activation domain results in increased
transcriptional activation of the reporter gene, we co-transfected HT1080 B cells with
plasmids expressing Gal4-3xFKBP and 1, 2, 3 or 4 copies of FRB fused to the p65 activation
domain. Surprisingly, unlike the DNA binding domain-FRB fusions, a single copy of FRB
fused to the p65 activation domain activated the reporter gene significantly more strongly than
corresponding fusion proteins containing 2 or more copies of FRB.

Method: HT1080 B cells were grown in MEM supplemented with 10% Bovine Calf Serum.
Approximately 4x10^5 cells/well in a 6 well plate were transiently transfected by the Lipofection
procedure as recommended by GIBCO, BRL. The DNA:Lipofectamine ratio used
corresponded to 1:6. Cells in each well received 1.9 µg of pUC118 plasmid as carrier, 100 ng
of pCGNNGal4F3 and 500 ng of one of the following plasmids: pCGNN 1, 2, 3 or 4 FRB-p65.
Following transfection, 2 ml fresh media was added and supplemented with rapamycin to
the indicated concentration. After 24 hrs, 100 µl of the media was assayed for SEAP activity
as described (Spencer et al., 1993).

Similar experiments were also conducted using another stable cell line (HT1080 B 4)
containing the 5xGal4-IL2-SEAP reporter gene and DNA sequences encoding a fusion
protein containing a Gal4 DNA binding domain and 3 copies of FKBP stably integrated.
These cells were transiently transfected with effector plasmids expressing the p65 activation
domain fused to 1 or more copies of an FRB domain. Similar to our observations with HT1080
B cells, effector plasmids expressing a single copy of the FRB-p65 activation domain fusion
protein activated the reporter gene more strongly than others with 2 or more copies of FRB.
4. Rapamycin-dependent transcriptional activation in transiently transfected
cells: ZFHD1 and p65 fusions

Human fibrosarcoma cells transiently transfected with a SEAP target gene and plasmids
encoding representative ZFHD-FKBP- and FRB-p65-containing fusion proteins exhibited
rapamycin-dependent and dose-responsive secretion of SEAP into the cell culture medium.
SEAP production was not detected in cells in which one or both of the transcription factor
fusion plasmids was omitted, nor was it detected in the absence of added rapamycin. When
all components were present, however, SEAP secretion was detectable at rapamycin
concentrations as low as 0.5 nM. Peak SEAP secretion was observed at 5 nM. Similar
results have been obtained when the same transcription factors were used to drive
rapamycin-dependent activation of an hGH reporter gene or a stably integrated version of
the SEAP reporter gene made by infection with a retroviral vector. It is difficult to determine
the fold activation in response to rapamycin since levels of SEAP secretion in the absence of
drug are undetectable, but it is clear that in this system there is at least a 1000-fold
enhancement over background levels in the absence of rapamycin. Thus, this system
exhibits undetectable background activity and high dynamic range.

Several different configurations for transcription factor fusion proteins were explored. When
various numbers of copies of FKBP domains were fused to ZFHD1 and various numbers of
copies of FRBs to p65, optimal levels of rapamycin-induced activation occurred when there
were multiple FKBPs fused to ZFHD1 and fewer FRBs fused to p65. The preference for
multiple drug-binding domains on the DNA-binding protein may reflect the capacity of these
proteins to recruit multiple activation domains and therefore to elicit higher levels of promoter
activity. The presence of only 1 drug-binding domain on the activation domain should allow
each FKBP on ZFHD to recruit one p65. Any increase in the number of FRBs on p65 would
increase the chance that fewer activation domains would be recruited to ZFHD, each one
linked by multiple FRB-FKBP interactions.
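The stoichiometric argument above can be made concrete with a toy calculation. The sketch below is not a binding model; it assumes every FKBP on the DNA-binding fusion is occupied and simply bounds how many activation-domain fusions can be attached. The function name `recruited_activators` is hypothetical.

```python
from math import ceil


def recruited_activators(n_fkbp, m_frb):
    """Return (fewest, most) activation-domain fusions bound to one DNA-binding
    fusion carrying n_fkbp FKBP domains, when each activator carries m_frb FRB
    domains and every FKBP is occupied by exactly one FRB."""
    if n_fkbp < 1 or m_frb < 1:
        raise ValueError("domain counts must be positive integers")
    fewest = ceil(n_fkbp / m_frb)  # each activator engages as many FKBPs as it can
    most = n_fkbp                  # each activator engages a single FKBP
    return fewest, most


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(f"3 FKBP, {m} FRB per activator ->", recruited_activators(3, m))
```

With a single FRB per activator the lower and upper bounds coincide, so three FKBP domains always recruit three p65 domains. With more FRBs per activator the lower bound drops, which is the scenario the text describes as fewer activation domains each linked by multiple FRB-FKBP interactions.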
HT1080 cells (ATCC CCL-121), derived from a human fibrosarcoma, were grown in MEM
supplemented with non-essential amino acids and 10% Fetal Bovine Serum. Cells plated in
24-well dishes (Falcon, 6 x 10^[illegible] cells/well) were transfected using Lipofectamine under
conditions recommended by the manufacturer (GIBCO/BRL). A total of 300 ng of the
following DNA was transfected into each well: 100 ng ZFHDx12-CMV-SEAP reporter gene,
2.5 ng pCGNN-ZFHD1-3FKBP or other DNA binding domain fusion, 5 ng
pCGNN-1FRB-p65(361-550) or other activation domain fusion and 192.5 ng pUC118. In cases where the
DNA binding domain or activation domain was omitted, an equivalent amount of empty
pCGNN expression vector was substituted. Following lipofection (for 5 hours) 500 µl
[illegible] Background SEAP activity, measured from mock-transfected [illegible]

[illegible] dishes (2 x 10^[illegible] cells/dish) were transfected by calcium phosphate [illegible]
([illegible] Kaiser, A. & Wendenburg, R., 1991, Mol. Gen. Genet. 227, 229-237) [illegible]
with phosphate buffered saline (PBS) and given fresh medium for 5 hours. To harvest for
[illegible], cells were removed from the dish in Hepes Buffered Saline Solution containing
[illegible] EDTA, washed with PBS/0.1% BSA/0.1% glucose and resuspended in the same at a
concentration of 2 x 10^[illegible] cells/ml.
Plasmids

Construction of the transcription factor fusion plasmids is described above.

[illegible] containing 12 tandem copies of a ZFHD1 binding site (Pomerantz et al.,
1995) and a basal promoter from the immediate early gene of human cytomegalovirus
([illegible]) driving expression of a gene encoding secreted alkaline phosphatase
(SEAP) was prepared by replacing the NheI-HindIII fragment of pSEAP Promoter
Vector (Clontech) with the following NheI-XbaI fragment containing 12 ZFHD binding sites:

(the ZFHD1 binding sites are underlined),
and the following XbaI-HindIII fragment containing a minimal CMV promoter (-54 to [illegible]5):
TCTAGAACQCXSAATICCGSr!
(the CMV minimal promoter is underlined).
[illegible] was constructed
[illegible] (Selden et al., MCB 6:3173-3179, 1986; the BamHI and EcoRI sites
were blunted together).

pZHWTx12-IL2-SEAP [illegible] pZHWTx12-CMV-SEAP except the XbaI-HindIII fragment
[illegible]
containing a minimal IL2 gene promoter (-72 to +45 with respect to the start site; Siebenlist et
al., MCB 6:3042-3049, 1986);



To facilitate the stable integration of a single, or few, copies of reporter gene the following
retroviral vector was constructed. pLH ([illegible]), which contains the hygromycin B
resistance gene driven by the Moloney murine leukemia virus LTR [illegible]
site, was constructed as follows: the hph gene was [illegible]
pBabe Hygro (Morgenstern and Land, NAR 18:3587-3596, 1990)
[illegible] (resulting in the loss of the bleo gene; the BamHI and HindIII sites were blunted
together).

[illegible] cloned into the ClaI site of pLH. It was oriented such that the directions of transcription
from the viral LTR and the internal ZFHD-IL2 promoters were the same.
To construct a retroviral vector containing 5 Gal4 sites embedded in a minimal IL2 promoter
driving expression of the SEAP gene, a ClaI-BstBI fragment consisting of the following was
inserted into the ClaI site of pLH such that the directions of transcription from the viral LTR
and the internal Gal4-IL2 promoters were the same: a ClaI-HindIII fragment containing 5
Gal4 sites (underlined) and regions -324 to -294 (bold) and -72 to +45 of the IL2 gene
(italics)

5' ATCGATCTTTTCTCAGTTACrr^

Ar^Ani^TTrY-im3AG CXi:AGAC^

and a HindIII-BstBI fragment containing the SEAP gene coding sequence (Berger et al.,
Gene 66:1-10, 1988) mutagenized to add the following sequence (containing a BstBI site)
immediately after the stop codon:
5'-CCCGTGGTCCCGCGTTGCTTCGAT

5. Rapamycin-dependent transcriptional activation in stably transfected cells

We conducted the following experiments to confirm that this system exhibits similar
properties in stably transfected cells. We generated stable cell lines by sequential
transfection of a SEAP target gene and expression vectors for ZFHD1-3FKBP and
1FRB-p65 respectively. A pool of several dozen stable clones resulting from the final transfection
exhibited rapamycin-dependent SEAP production. From this pool, we characterized several
individual clones, many of which produced high levels of SEAP in response to rapamycin.
One such clone produced SEAP at levels approximately forty times higher than the pool and
significantly higher than transiently transfected cells. In an attempt to rigorously quantitate
background SEAP production and induction ratio in this clone, we performed a second set of
assays in which the length of the SEAP assay was increased by a factor of approximately
50 to detect any SEAP activity in untreated cells. Under these conditions, mock transfected
cells produced 47 arbitrary fluorescence units, while the transfected clone produced 54 units
in the absence of rapamycin and over 90,000 units at 100 nM rapamycin. Thus, in this stable
cell line, background gene expression was negligible and the induction ratio (7 units to 90,000
units) was greater than four orders of magnitude.
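The induction ratio quoted above follows directly from the three readings. The short calculation below reproduces it, taking 90,000 units as a lower bound for the induced value (the text says "over 90,000") and, as in the text, comparing it against the mock-subtracted basal signal; the variable names are illustrative.

```python
from math import log10

mock_units = 47          # mock-transfected cells
untreated_units = 54     # stable clone, no rapamycin
induced_units = 90_000   # stable clone, 100 nM rapamycin (lower bound)

basal_units = untreated_units - mock_units   # signal above mock background
induction_ratio = induced_units / basal_units
orders_of_magnitude = log10(induction_ratio)

if __name__ == "__main__":
    print(f"basal signal above mock: {basal_units} units")
    print(f"induction ratio: at least {induction_ratio:,.0f}-fold")
    print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

Because the basal signal is only a few units above the mock reading, the ratio is sensitive to small changes in the background measurement, which is why the longer assay was needed to obtain a meaningful denominator.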

To simplify the task of stable transfection, we used a bicistronic expression vector that
directs the production of both ZFHD1-3FKBP and 1FRB-p65 through the use of an internal
ribosome entry sequence (IRES). This expression plasmid was cotransfected, together with
a zeocin-resistance marker plasmid, into a cell line carrying a retrovirally-transduced SEAP
reporter gene, and a pool of approximately fifty drug-resistant clones was selected and
expanded. This pool of clones also exhibited rapamycin-dependent SEAP production with
no detectable background and a very similar dose-response curve to that observed in
transiently transfected cells. Our results indicate that rapamycin-responsive gene expression
can be readily obtained in both transiently and stably transfected cells. In both cases,
regulation is characterized by very low background and high induction ratios.

Stable cell lines. Helper-free retroviruses containing the reporter gene or DNA binding
domain fusion were generated by transient co-transfection of 293T cells (Pear, W.S., Nolan,
G.P., Scott, M.L. & Baltimore, D., 1993, Production of high-titer helper-free retroviruses by
transient transfection. Proc. Natl. Acad. Sci. USA 90, 8392-8396) with a Psi(-) amphotropic
packaging vector and the retroviral vectors pLH-ZHWTx12-IL2-SEAP or
SMTN-ZFHD1-3FKBP respectively. To generate a clonal cell line containing the reporter gene stably
integrated, HT1080 cells infected with retroviral stock were diluted and selected in the
presence of 300 µg/ml Hygromycin B. Individual clones from this and other cell lines
described below were screened by transient transfection of the missing components
followed by the addition of rapamycin as described above. All 12 clones analyzed were
inducible and had little or no basal activity. The most responsive clone, HT1080L, was
selected for further study.
HT20-6 cells, which contain the pLH-ZHWTx12-IL2-SEAP reporter gene, ZFHD1-3FKBP
DNA binding domain and 1FRB-p65(361-550) activation domain stably integrated,
were generated by first infecting HT1080L cells with SMTN-ZFHD1-3FKBP-packaged
retrovirus and selecting in medium containing 500 µg/ml G418. A strongly responsive clone,
HT1080L3, was then transfected with linearized pCGNN-1FRB-p65(361-550) and pZeoSV
(Invitrogen) and selected in medium containing 250 µg/ml Zeocin. Individual clones were first
tested for the presence of 1FRB-p65(361-550) by western. Eight positive clones were
analyzed by addition of rapamycin. All eight had low basal activity and in six of them, gene
expression was induced by at least two orders of magnitude. The clone that gave the
strongest response, HT20-6, was selected for further analysis.

HT23 cells were generated by co-transfecting HT1080L cells with linearized
pCGNN-1FRB-p65(361-550)-IRES-ZFHD1-3FKBP and pZeoSV and selecting in medium containing 250
µg/ml Zeocin. Approximately 50 clones were pooled for analysis.

For analysis, cells were plated in 96-well dishes (1.5 x 10^4 cells/well) and 200 µl
medium containing the indicated amounts of rapamycin (or vehicle) was added to each well.
After 18 hours, medium was removed and assayed for SEAP activity. In some cases
medium was diluted before analysis and relative SEAP units obtained multiplied by the
fold-dilution. Background SEAP activity, measured from untransfected HT1080 cells, was
subtracted from each value.
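The dilution and background correction described above amounts to a one-line calculation. The helper below is a minimal sketch under the assumption that the background reading was taken on undiluted medium and is subtracted after the dilution factor is applied; the text does not state the order explicitly, and the function name and example readings are hypothetical.

```python
def corrected_seap(raw_units, fold_dilution=1.0, background=0.0):
    """Relative SEAP units scaled by the fold-dilution, minus the background
    measured from untransfected cells (assumed measured on undiluted medium)."""
    if fold_dilution < 1:
        raise ValueError("fold_dilution must be at least 1")
    return raw_units * fold_dilution - background


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    print(corrected_seap(120.0, fold_dilution=10, background=50.0))
    print(corrected_seap(80.0, background=50.0))
```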
6. Rapamycin-dependent Production of hGH in Mice

In vivo methods: Animals, husbandry, and general procedures. Male nu/nu mice were
[illegible] appeared [illegible] as a result of housing, husbandry techniques, or experimental
[illegible].

To transplant transiently transfected cells into mice, 2 x 10^6 transfected HT1080
[illegible] PBS/0.1% BSA/0.1% glucose buffer, and administered into
[illegible]
dimethylacetamide and a 9:1 (v:v) mixture of polyethylene glycol (average molecular weight
[illegible]) and polyoxyethylene sorbitan monooleate. Concentrations of rapamycin in the
[illegible] were sufficient to allow for [illegible]
dose in a 2.0 ml/kg injection volume. The accuracy of the dosing [illegible]
HPLC analysis prior to intravenous administration into the tail veins. Some control mice,
bearing no transfected HT1080 cells, received 10.0 mg/kg rapamycin. In addition, other
control mice, bearing transfected cells, received only the [illegible].

Blood was collected by either anesthetizing or sacrificing mice via CO2 inhalation.
Anesthetized mice were used to collect 100 µl of blood by cardiac puncture. The mice were
[illegible] and allowed to recover [illegible]
[illegible]
experiments in the ELISA demonstrate no cross reactivity with endogenous, murine
hGH in diluent sera or native samples.

hGH expression in vivo. For the assessment of dose-dependent rapamycin-induced
[illegible] of hGH expression, rapamycin [illegible]
following injection of HT1080 cells. Rapamycin doses were either 0.01, 0.03, 0.1, 0.3,
3.0, or 10.0 mg/kg. Seventeen hours following rapamycin administration, the mice were
sacrificed for blood collection.

To address the time course of in vivo hGH expression, mice received 10.0 mg/kg of
rapamycin one hour following injection of the cells. Mice were sacrificed at 4, 8, 17, 24,
[illegible] sustained expression of hGH from transplanted
HT1080 cells [illegible] repeatedly administered [illegible]
animals and their control counterparts were each [illegible]
groups to reduce the frequency of blood collection for each animal.

[illegible] elicited dose-responsive production of hGH [illegible]
levels in humans (0.2-0.3 ng/ml). [illegible] transfected cells for hGH production
experiments, suggesting that the maximal [illegible]
was not reached. Control [illegible] detectable serum hGH. Thus,
both engineered cells and rapamycin [illegible]

The presence of significant levels of hGH [illegible] rapamycin
administration was noteworthy, [illegible] half-life of
less than four minutes in these animals. [illegible] that the engineered cells
continued to secrete [illegible] for many hours [illegible]. To examine the
kinetics of rapamycin [illegible] hGH [illegible]
rapamycin and then measured hGH [illegible]
observed within four hours of rapamycin [illegible] hours after
[illegible]

Indeed, treated animals held relatively stable levels of circulating hGH in response to
repeated doses of rapamycin. After the final dose, hGH [illegible] remained constant for 16
[illegible]
Apparently, protein production is rapidly terminated upon withdrawal of drug.

[illegible] a regulatable system.

B. Hybrid transcription factors containing such modular components work well in
constitutive expression

Plasmids
An expression vector for directing the expression of ZFHD1 coding sequence in mammalian
cells was prepared as follows. [illegible] sequences were amplified from a cDNA clone by PCR
[illegible] sequences were amplified from a [illegible].
Both fragments were ligated in a 3-way ligation between the XbaI and BamHI sites of
pCGNN (Attar and Gilman, 1992) to make pCGNNZFHD1, in which the cDNA insert is under
the transcriptional control of human CMV promoter and enhancer sequences and is linked to
the nuclear localization sequence from SV40 T antigen. The plasmid pCGNN also contains a
gene for ampicillin resistance which can serve as a selectable marker.



pCGNNZFHD1-p65
An expression vector for directing the expression in mammalian cells of a chimeric
transcription factor containing the composite DNA-binding domain, ZFHD1, and a transcription
activation domain from p65 (human) was prepared as follows. The sequence encoding the
C-terminal region of p65 containing the activation domain (amino acid residues 450-550) was
amplified from pCGN-p65 using primers p65 5' Xba and p65 3' Spe/Bam. The PCR fragment
was digested with XbaI and BamHI and ligated between the SpeI and BamHI sites of
pCGNN ZFHD1 to form pCGNN ZFHD-p65AD.

The P65 transcription activation sequence contains the following linear sequence: 

CTCGGQGCCITCCTTO^ 
TCAGCAGCnGCIGAACCAGGGCATMrro^^ 

CrATT^OaSCCrM^IGACAGOGGOT 

C(XAAraXCTCCTITCAQGAGATC^^ 

GATCAGCTCC 

pCGNNZFHD1-FKBPx3

An expression vector for directing the expression of ZFHD1 linked to three tandem repeats of
human FKBP was prepared as follows. Three tandem repeats of human FKBP were isolated
as an XbaI-BamHI fragment from pCGNNF3 and ligated between the SpeI and BamHI sites
of pCGNNZFHD1 to make pCGNNZFHD1-FKBPx3 (ATCC Accession No. 97399).

pZHWTx8SVSEAP
A reporter gene construct containing eight tandem copies of a ZFHD1 binding site (Pomerantz
et al., 1995) and a gene encoding secreted alkaline phosphatase (SEAP) was prepared by
ligating the tandem ZFHD1 binding sites between the NheI and BglII sites of
pSEAP-Promoter Vector (Clontech) to form pZHWTx8SVSEAP. The ZHWTx8SEAP reporter
contains two copies of the following sequence in tandem:

The ZFHD1 binding sites are underlined.

One or two copies of FKBP12 were amplified from pNFSVE using primers FKBP 5' Xba and
FKBP 3' Spe/Bam. The PCR fragments were digested with XbaI and BamHI and ligated
between the XbaI and BamHI sites of pCGNN vector to make pCGNN F1 or pCGNN F2.
pCGNNZFHD1-FKBPx3 can serve as an alternate source of the FKBP cDNA.

A fragment containing two tandem copies of FKBP was excised from pCGNN F2 by
digesting with XbaI and BamHI. This fragment was ligated between the SpeI and BamHI
sites of pCGNN F1.

The C-terminal region of the Herpes Simplex Virus protein, VP16 (AA 418-490), containing the
activation domain was amplified from pCG-Gal4-VP16 using primers VP16 5' Xba and VP16
3' Spe/Bam. The PCR fragment was digested with XbaI and BamHI and ligated between
the SpeI and BamHI sites of pCGNN F3 plasmid.

pCGNN F3p65
The XbaI and BamHI fragment of p65 containing the activation domain was prepared as
described above. This fragment was ligated between the SpeI and BamHI sites of
pCGNN F3.

Primers 

5'Xba/Zif 5'ATGCTCTAGAGAACGCCCATATGCTTGCCCT 
3,2f+G 5'ATGCGCGGGGGGGGGGTGTGTGGGTGCGGATGTG 
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5'Not OctHD 5'ATGGGGGGGGGGAGGAGGAAGAAAGGGAGGAGG 
Spe/Bam 3'Oct S'GGATGGATGGGATTGAACTAGTGTrGATrGrrTTTTGTTTGTGGGGGGG 

FKBP 5'Xba 5TGAGTGTAGAGGAGTGGAGGTGGAAAGGAT 
35 FKBP 3' Spe/Bam 5TGAGGGATGGTGAATAACTAGTTTGGAGTTTTAGAAGGTC 
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VP1 6 5' Xba 5'ACTGTCTAGAGTCAGCCTGGGGGACGAG 

VP16 3' Spe/Bam 5'GCATGGATGCGATTGAACTAGTCGCAGGGTAGTCGTGAATTCC 

P65 5' Xba 5'ATGCTCTAGACTGGGGGCCTTGCTTGGCAAC 
P65 3' Spe/Bam 5'GCATGGATCCGCTCAACTAGTGGAGCTGATCTGACTCAG 
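
The layout of the Spe/Bam 3' primers can be illustrated with a short Python sketch. It takes the p65 3' Spe/Bam primer and the p65(361-550) 5' Xba primer as they are written out in Section II below, reverse-complements the 3' primer to the coding strand, and locates the restriction sites and the stop codon. The helper names (revcomp, find_sites) are illustrative assumptions and not part of the described methods; the recognition sequences used are the standard ones for XbaI (TCTAGA), SpeI (ACTAGT) and BamHI (GGATCC).

```python
# Illustrative sketch only: dissect the p65 primers into their functional parts.
# Recognition sequences are the standard ones for these enzymes.
SITES = {"XbaI": "TCTAGA", "SpeI": "ACTAGT", "BamHI": "GGATCC"}


def revcomp(seq):
    """Return the reverse complement of a DNA sequence (upper case)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_sites(seq):
    """Map each enzyme whose site is present to the index of its first occurrence."""
    seq = seq.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


# Primer sequences as written in the text for the p65(361-550) construct.
P65_5_XBA = "ATGCTCTAGAGATGAGTTTCCCACCATGGTG"
P65_3_SPE_BAM = "GCATGGATCCGCTCAACTAGTGGAGCTGATCTGACTCAG"

# The 3' primer anneals to the coding strand, so its reverse complement
# shows what is appended to the end of the amplified coding sequence.
coding = revcomp(P65_3_SPE_BAM)
sites = find_sites(coding)

# First stop codon (TGA) at or after the SpeI site, and whether it is in
# frame with the first base of the annealing region (index 0).
stop_index = coding.find("TGA", sites["SpeI"])
stop_in_frame = stop_index % 3 == 0

print(coding)
print(sites, stop_index, stop_in_frame)
print(find_sites(P65_5_XBA))
```

Read this way, the coding strand ends with the last p65 codons, then a SpeI site, a stop codon and a BamHI site, which is the arrangement that lets a further Xba1/BamH1 fragment be added at the Spe1 site in a later round.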

References 

1. Attar, R.M., and M.Z. Gilman 1992. Mol. Cell. Biol. 12:2432-2443. 
2. Ausubel, F.M. et al., Eds., 1994. Current Protocols in Molecular Biology (Wiley, NY). 
3. Pomerantz, J.L., et al. 1995. Science. 267:93-96. 
4. Spencer, D.M., et al. 1993. Science. 262:1019-1024. 
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II. Evaluation of representative illustrative chimeric transcription factors 

Constructs encoding the following GAL4-based and ZFHD1-based chimeric transcription factors 
were prepared and tested in human cell lines containing stably integrated SEAP reporter 
constructs containing GAL4 or ZFHD1 recognition sequences, as appropriate. 

chimeric factor             data shown in Figure 

G-K                         Fig. 2 
G-KK 
G-KKK 
G-KKKK 
G-KKKKK 
G-KKKKKK 

G-(V8x2)                    Fig. 3 
G-(V8x2)2 
G-(V8x2)3 
G-(V8x2)4 
G-(V8x2)5 
G-(V8x2)6 

G-D                         Fig. 4 
G-DD 
G-DDD 
G-DDDD 
G-DDDDD 
G-DDDDDD 

Z-VP16                      Fig. 5 
Z-k 
Z-kkk 
Z-K 
Z-KKK 

G-KKK-(V8x2)4               Fig. 6 
G-KKK-DDDDD 
G-(V8x2)4-DDDDD 
G-KKK-(V8x2)4-DDDDD 

G-K                         Fig. 7 
G-KKK 
G-HSF-HSF 
G-HSF-HSF-HSF-HSF 
G-K-HSF-HSF-HSF-HSF 
G-KKK-HSF-HSF-HSF-HSF 

abbreviations: 

G = GAL4 residues 1-94 
K = p65(361-550) = "N361" in Fig. 6 
k = p65(450-550) = "N450" in Fig. 6 
V8x2 = tandem repeat of VP16 V8 sequence with an intervening SerArg resulting from 
ligation; (V8x2)4 = "8V8" in Fig. 6 
D = VP16 C terminal SRDFDLDMLG containing an initial SerArg resulting from 
ligation = "Vc" in Fig. 6 
Z = ZFHD1 ("ZH" in Fig. 5) 
HSF = 14-mer (see table below) 
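
The shorthand used for the chimeric factors can be expanded mechanically. The following Python sketch is one possible reading of the notation, under the assumption that a run of repeated letters (KKK), a parenthesized unit with a trailing count ((V8x2)4) and a hyphen-separated repeat (HSF-HSF) all denote reiterated copies of the same unit. The function name parse_factor and the DOMAINS table are hypothetical conveniences, not part of the described constructs.

```python
import re

# Units taken from the abbreviation list; descriptions are shortened labels.
DOMAINS = {
    "G": "GAL4 residues 1-94",
    "Z": "ZFHD1",
    "K": "p65(361-550)",
    "k": "p65(450-550)",
    "D": "VP16 C-terminal D domain",
    "V8x2": "tandem repeat of VP16 V8",
    "HSF": "HSF 14-mer",
    "VP16": "VP16 activation domain",
}


def parse_factor(name):
    """Expand e.g. 'G-KKK-(V8x2)4' into [('G', 1), ('K', 3), ('V8x2', 4)]."""
    parts = []
    for token in name.split("-"):
        bracketed = re.fullmatch(r"\((\w+)\)(\d*)", token)
        if bracketed:
            # '(V8x2)4' -> unit 'V8x2', four copies; a missing count means one copy.
            unit, count = bracketed.group(1), int(bracketed.group(2) or 1)
        elif token in DOMAINS:
            unit, count = token, 1
        elif len(set(token)) == 1 and token[0] in DOMAINS:
            # 'KKK' -> unit 'K', three copies.
            unit, count = token[0], len(token)
        else:
            raise ValueError(f"unrecognized unit: {token!r}")
        if unit not in DOMAINS:
            raise ValueError(f"unrecognized unit: {unit!r}")
        # Merge hyphen-separated repeats such as HSF-HSF into one entry.
        if parts and parts[-1][0] == unit:
            parts[-1] = (unit, parts[-1][1] + count)
        else:
            parts.append((unit, count))
    return parts


print(parse_factor("G-KKK-(V8x2)4-DDDDD"))
print(parse_factor("G-K-HSF-HSF-HSF-HSF"))
```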
PCG-Gal4 vector containing GAL4 amino acids 1-94 was digested with Xba1 and BamH1. The p65 
activation domain sequence between amino acids 361-550 was generated by PCR using 
the following oligonucleotides: 

5'-atgctctagagatgagtttcccaccatggtg-3' 
and 

5'-gcatggatccgctcaactagtggagctgatctgactcag-3'. 

This fragment was digested with Xba1 and BamH1 and cloned into PCG-Gal4 vector to 
make PCG-Gal4-p65(361-550), hereafter referred to as PCG-GK. To make the PCG-GK2 
plasmid, the p65 activation domain-containing PCR fragment described above was digested 
with Xba1 and BamH1 and cloned into Spe1 and BamH1 digested PCG-GK vector. PCG-GK3, 
4, 5 and 6 were all generated following the same approach. 
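
The reiteration scheme relies on the fact that Xba1 (T^CTAGA) and Spe1 (A^CTAGT) leave compatible CTAG overhangs, so an Xba1/BamH1 fragment can be ligated into a Spe1/BamH1 digested vector. The following Python sketch models this on the top strand only. The vector backbone ("NNNN"), the shortened insert and the function names are hypothetical stand-ins chosen for illustration; only the site sequences and cut positions are taken as given.

```python
# Top-strand model of the iterative Xba1/Spe1 cloning scheme (illustrative only).
XBAI, SPEI, BAMHI = "TCTAGA", "ACTAGT", "GGATCC"


def insert_fragment(pcr_product):
    """Top strand of the Xba1/BamH1 fragment (cuts: T^CTAGA and G^GATCC)."""
    start = pcr_product.index(XBAI) + 1
    end = pcr_product.index(BAMHI, start) + 1
    return pcr_product[start:end]


def open_vector(plasmid, five_prime_site):
    """Cut after the first base of the 5' site and of the downstream BamH1 site."""
    left_end = plasmid.index(five_prime_site) + 1
    right_start = plasmid.index(BAMHI, left_end) + 1
    return plasmid[:left_end], plasmid[right_start:]


def add_copy(plasmid, fragment, five_prime_site):
    """Ligate the fragment between the two vector arms."""
    left, right = open_vector(plasmid, five_prime_site)
    return left + fragment + right


# Hypothetical, shortened PCR product: Xba1 site, a few codons standing in
# for the activation domain, then Spe1 site, stop codon and BamH1 site.
PCR = "ATGC" + XBAI + "GATGAG" + "ATCAGCTCC" + SPEI + "TGAGC" + BAMHI + "ATGC"
# Hypothetical vector with adjacent Xba1 and BamH1 cloning sites.
VECTOR = "NNNN" + XBAI + BAMHI + "NNNN"

fragment = insert_fragment(PCR)
gk1 = add_copy(VECTOR, fragment, XBAI)   # first copy: Xba1/BamH1 into Xba1/BamH1
gk2 = add_copy(gk1, fragment, SPEI)      # later copies: Xba1/BamH1 into Spe1/BamH1
gk3 = add_copy(gk2, fragment, SPEI)

print(gk3)
print(gk3.count(SPEI), gk3.count(XBAI), gk3.count("ACTAGA"))
```

In this model each Spe1/Xba1 junction reads ACTAGA, which is recognized by neither enzyme, so every round leaves a single Spe1 site at the 3' end of the array available for the next addition.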

PCG-Gal4 plasmids containing reiterated copies of the V8 domain were generated by 
the following method. The oligonucleotides 5'-ctagagacttcgacttggacatgct-3', 
5'-agtcccccagcatgtccaagtcgaagtct-3', 5'-gggggacttcgacttggacatgctgactagttgag-3' and 
5'-gatcctcaactagtcagcatgtccaagtcga-3' were phosphorylated and annealed separately. 
Together these oligonucleotides make two tandem V8 coding sequences. These annealed 
oligos were then ligated into Xba1 and BamH1 digested PCG-Gal4 vector. The resulting 
vector, PCG-GV2, containing two copies of V8 sequences, was digested with Spe1 and 
BamH1. V8x2 oligos made as described above were ligated into this vector to make 
PCG-GV4. The same approach was taken to generate PCG-GV6, 8, 10 and 12 plasmids. 
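
How the four V8 oligonucleotides fit together can be checked with a short Python sketch. It assumes that the first and second oligonucleotides form one duplex and the third and fourth form the other, and that the reading frame is set by the Xba1 site (TCT AGA) restored on ligation into the vector. The pairing and the frame are assumptions made for illustration; the helper names are hypothetical, and the codon table is the standard genetic code.

```python
# Illustrative check of the V8x2 oligonucleotide design.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def revcomp(seq):
    """Return the reverse complement of a DNA sequence (upper case)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def translate(seq):
    """Translate complete codons from the first base; '*' marks a stop."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Oligonucleotides as listed in the text.
top1 = "ctagagacttcgacttggacatgct".upper()
bottom1 = "agtcccccagcatgtccaagtcgaagtct".upper()
top2 = "gggggacttcgacttggacatgctgactagttgag".upper()
bottom2 = "gatcctcaactagtcagcatgtccaagtcga".upper()

# Duplex 1: top1 keeps a 5' CTAG overhang (Xba1-compatible).
rc1 = revcomp(bottom1)
duplex1_ok = rc1.startswith(top1[4:])
bridge = rc1[len(top1) - 4:]          # overhang that pairs with duplex 2

# Duplex 2: bottom2 keeps a 5' GATC overhang (BamH1-compatible).
rc2 = revcomp(bottom2)
duplex2_ok = top2.startswith(bridge) and rc2.startswith(top2[len(bridge):])
bam_overhang = rc2[-4:]

# Leading "T" restores the Xba1 site supplied by the digested vector.
peptide = translate("T" + top1 + top2)

print(duplex1_ok, duplex2_ok, bridge, bam_overhang)
print(peptide)
```

Under these assumptions the assembled insert carries an in-frame Spe1 site followed by a stop codon immediately upstream of the BamH1 end, which is what allows further V8x2 units to be added at the Spe1 site.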

PCG-Gal4 plasmids containing reiterated copies of the VP16 C-terminus, hereafter referred to 
as the D activation domain, were constructed as follows. The VP16 C-terminus region was PCR 
amplified. The PCR fragments were digested with Xba1 and BamH1 and cloned into PCG-Gal4 
previously digested with Xba1 and BamH1. The resulting plasmid was designated as 
PCG-GD. To make PCG-GD2, PCG-GD was digested with Spe1 and BamH1 and ligated with 
Xba1 and BamH1 digested D fragment described above. PCG-GD3, 4, 5 and 6 were 
constructed using the same approach. 
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Plasmids PCG-GK3V8 and PCG-GK3D5 were made by digesting PCG-GV8 and PCG-D5 
plasmids with Xba1 and BamH1 and cloning the fragments containing V8 and D5 sequences 
respectively into PCG-GK3 digested with Spe1 and BamH1. Similarly, an Xba1/BamH1 
fragment from PCG-GD5 containing D5 sequences was cloned into Spe1/BamH1 digested 
PCG-GV8 plasmid to construct PCG-V8D5 plasmid. The V8D5 fragment was excised from 
this plasmid by digesting it with Xba1 and BamH1 and the fragment was cloned into 
Spe1/BamH1 digested PCG-K3 to make PCG-K3V8D5 plasmid. 

PCGNN-ZFHD-p65(450-550) and PCGNN-ZFHD-p65(361-550) are described above. 
PCGNN-p65(450-550)x3 and PCGNN-ZFHD-p65(361-550) were made as follows: PCG-Gal4-p65(450-550)x3 
and PCG-Gal4-p65(361-550) were digested with Xba1 and BamH1 
and the p65(450-550)x3 and p65(361-550) were excised. These fragments were cloned into 
Spe1/BamH1 digested PCGNN-ZFHD to generate PCGNN-ZFHD-p65(450-550) and 
PCGNN-ZFHD-p65(361-550). 

PCG-Gal4-HSFX2 containing two copies of HSF14 activation domain was made by 
phosphorylating and ligating the following oligonucleotides to Xba1 and BamH1 digested 
PCG-Gal4 plasmid: 


5'-ctagagacaccagtgccctgctggacctgttcagcccctcg-3'; 
5'-ggtcaccgaggggctgaacaggtccagcagggcactggtgtct-3'; 
5'-gtgaccgtgcccgacatgagcctgcctgaccttgacagcag-3' and 
5'-gtgaccgtgcccgacatgagcctgcctgaccttgacagcag-3'. 

Two additional copies of HSF activation domain were added to Spe1/BamH1 digested 
PCG-Gal4-HSFX2 plasmid by the same method to generate PCG-Gal4-HSFX4 plasmid. A 
fragment containing four copies of HSF14 activation domain was excised from 
PCG-Gal4-HSFX4 by Xba1 and BamH1 digestion. The resulting fragment was cloned into Spe1 and 
BamH1 digested PCG-Gal4KX1 and PCG-Gal4KX3 to make PCG-Gal4-K+HSFX4 or 
PCG-Gal4-K3+HSFX4 plasmids. 
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reporter cell lines 

Human HT1080 cells were engineered by the stable introduction of a secreted alkaline 
phosphatase (SEAP) target gene construct. The target gene construct contained a gene 
encoding SEAP operably linked to a transcription control sequence containing five copies of a 
DNA recognition sequence for GAL4 and a minimal IL-2 promoter. The resultant cells may be 
used in experiments such as described in Example 3 in which the cells are further transfected 
with DNA constructs encoding various transcription factors containing one or more DNA 
binding domains recognized by the target gene construct. 

plasmid constructions: pLH-G5-IL2-SEAP (as previously described) 

cell culture: HT1080 cells (ATCC CCL-121), derived from a human fibrosarcoma, were 
grown in MEM supplemented with non-essential amino acids and 10% Fetal Bovine Serum. 
Helper-free retroviruses containing the 5xGAL4-IL2-SEAP reporter gene were generated by 
transient transfection of 293T cells (Pear, W.S., Nolan, G.P., Scott, M.L. & Baltimore, D. 
Production of high-titer helper-free retroviruses by transient transfection. Proc. Natl. Acad. Sci. 
USA 90, 8392-8396 (1993)) with a Psi(-) amphotropic packaging vector and 
pLH-5xGAL4-IL2-SEAP. To generate a clonal cell line containing the SEAP reporter 
gene stably integrated, HT1080 cells infected with retroviral stock were diluted and selected 
in the presence of 300 ug/ml Hygromycin B. Individual clones were screened for the 
presence of integrated reporter gene by transient transfection of a plasmid encoding a 
chimeric transcription factor containing a GAL4 DNA binding domain. The most responsive 
clone, HT1080B, was used for subsequent analysis. 

Analysis of chimeric transcription factors 

Transfection: HT1080B cells were grown in MEM supplemented with 10% Bovine Calf 
Serum. Approximately 2 x 10^5 cells/well in a 12 well plate were transiently transfected by 
the Lipofectamine procedure as recommended by GIBCO BRL. The DNA:Lipofectamine ratio 
was adjusted to 1:6. Cells in each well received the indicated amounts of effector plasmids, 
and total DNA concentration in each well was adjusted to 1.25 ug with pUC18 DNA. 
Following transfection, 1 ml fresh media was added to each well. After 24 hrs, 100 ul of the 
media was assayed for SEAP activity as described. 
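
The per-well DNA mix follows from simple arithmetic, sketched below in Python. The 1.25 ug total is taken from the text; the function name well_mix is hypothetical, and the 1:6 DNA:Lipofectamine ratio is read here as six parts of reagent per part of DNA in whatever units the supplier's protocol uses, which is an assumption.

```python
# Illustrative per-well calculation for the transient transfection.
TOTAL_DNA_UG = 1.25                 # total DNA per well, from the text
REAGENT_PARTS_PER_DNA_PART = 6      # assumed reading of the 1:6 ratio


def well_mix(effector_ug):
    """Return effector, pUC18 filler and reagent amounts for one well."""
    if effector_ug < 0 or effector_ug > TOTAL_DNA_UG:
        raise ValueError("effector amount must lie between 0 and the total DNA")
    return {
        "effector_ug": effector_ug,
        # pUC18 DNA makes up the difference to the fixed total.
        "pUC18_ug": round(TOTAL_DNA_UG - effector_ug, 3),
        # Reagent scales with total DNA, not with effector alone.
        "lipofectamine_parts": TOTAL_DNA_UG * REAGENT_PARTS_PER_DNA_PART,
    }


for amount in (0.05, 0.25, 1.25):
    print(well_mix(amount))
```

Holding total DNA constant with pUC18 filler means that differences in SEAP output between wells can be attributed to the amount of effector plasmid rather than to the total DNA load.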
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Representative results: 

chimeric transcription factor                      number of        transcription 
                                                   activation       activation 
                                                   domains          (IL2 promoter) 

GAL4-p65(361-550)                                  1 to 6           ++++ 
GAL4-p65(450-550)                                  1 to 6           -m- 
GAL4-p65(361-450)                                  1 to 6           -- 
GAL4-K13 (SRDFADMDFDALL, derived from p65)         1 to 6           +++ 
GAL4-Oct2 Q domain (aa 95-160)                     1 to 6 
GAL4-Oct2 P domain (aa 438-479)                    1 to 6           -- 
GAL4-HSF (aa 409-444)                              1 to 4           -HH- 
GAL4-HSF14 (DLDSSLASIQELLS)                        1 to 4           -H- 
GAL4-EWS11 (SRSYGQQGSGS)                           1 to 8 
GAL4-V8X2 (DFDLDMLGDFDLDMLGSR)                     1 to 12          -H- 
GAL4-D (VP16 aa 459-490)                           1 to 6           -H-H 
GAL4-VP16 (VP16 aa 411-490)                        1 to 4           -H- 

III. Illustrative chimeric transcription factors for allostery-based systems 

Constructs for directing the expression of an RU486-dependent transcription factor 
consisting of a progesterone receptor ligand binding domain, a GAL4 DNA binding domain 
and a p65 transcription activation domain can be prepared as follows. Primers 5'-p65-BglII 
and 3'-p65-BamHI can be used to amplify amino acids 361 to 550 of p65 from plasmid pCG-p65 
(Rivera et al., Nature Medicine 2(9):1028-1032, 1996). The resulting fragment, which will 
have 5' BglII and 3' BamHI sites, can then be inserted into the BglII site of plasmid pGL 
(Wang et al., PNAS USA 91:8180-8184, 1994), which contains a truncated human 
progesterone receptor sequence (amino acids 640-891) and a GAL4 DNA binding domain 
(amino acids 1-94). Up to 2 nucleotides may be added so that the p65 sequence 
is in frame downstream of the ATG and upstream of the GAL4 coding region. 

5'-p65-BglII: agatctXGATGAGTTTCCCACCATG 
3'-p65-BamHI: ggatccXGGAGCTGATCTGACTCAG 

where X is 0, 1 or 2 nucleotides that may be required to create in-frame fusions. 
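
The number of nucleotides represented by X follows from the reading frame. The Python sketch below is a minimal illustration, assuming the frame is counted from the first base of the upstream ATG and that the first p65 codon in the primer is GAT. The function names and the filler base "N" are hypothetical; in practice the filler bases would be chosen to give an acceptable codon.

```python
# Illustrative calculation of the spacer X in the 5'-p65-BglII primer.
BGLII_SITE = "agatct"
P65_5P = "GATGAGTTTCCCACCATG"   # p65-specific part of the primer, from the text


def spacer_length(upstream_nt):
    """Nucleotides (0, 1 or 2) needed to bring the next base onto a codon boundary."""
    return (-upstream_nt) % 3


def build_primer(nt_before_site, filler="N"):
    """Assemble site + X + p65 sequence.

    nt_before_site is the number of coding nucleotides, counted from the first
    base of the ATG, that precede the BglII site in the target plasmid.
    """
    x = spacer_length(nt_before_site + len(BGLII_SITE))
    return BGLII_SITE + filler * x + P65_5P


for upstream in (3, 4, 5):
    print(upstream, build_primer(upstream))
```

The same calculation applies to the 3' primer, counting instead from the last p65 codon to the first codon of the downstream coding region.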
and 3'-PR-LBD can be used to amplify amino acids 640-891 of hPRB891 from plasmid 
pT7bbPRB-891 (Vegeto et al., Cell 69:703-713, 1992). The resulting fragment, which will 
have 5' XbaI and 3' SpeI sites, can be inserted into the SpeI site of pCGNN-ZFHD1-p65. 
This will place the PR-LBD in-frame and at the carboxy terminus of p65. 

5'-PR-LBD: 5'-tctagaAAAAAGTTCAATAAAGTCAG 
3'-PR-LBD: 5'-actagtGCAGTACAGATGAAGTTG 

rtTA/p65 

A tetracycline inducible transcription factor containing the p65 activation domain can be 
constructed using pUHD17-1 (described in US 5,654,168) as follows. Digest pUHD17-1 
with AflII. Remove the protruding 5' end with mung bean nuclease and ligate the synthetic 
oligonucleotide 5'-CactagtTAACTAAGTAA. The resulting plasmid, rTetR-SpeI, contains a 
SpeI cleavage site at the very end of the rTetR gene. 

Use primers 5'-p65-XbaI and 3'-p65-SpeI to amplify amino acids 361 to 550 of p65 from 
plasmid pCG-p65 (Rivera et al., supra) and clone this fragment, in frame, into the SpeI site of 
rTetR-SpeI. 

5'-p65-XbaI: 5'-tctagaGATGAGTTTCCCACCATG 
3'-p65-SpeI: 5'-actagtGGAGCTGATCTGACTCAG 

IV. Regulated transcription mediated by the chimeric transcription factor S3H 

The S3H activation domain was used to induce transcription of a SEAP target gene in 
response to rapamycin. S3H (p65(281-551)-HSF(406-530))- and S (p65(361-551))-containing 
activation domain fusions were stably integrated into HT1080L cells (which 
already carry a stably integrated target SEAP gene) as part of the transcription factor vectors 
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pR5N2EN-RHlS3H/Z1F3/Neo and pR5N2EN-RHlS/Z1F3/Neo, respectively. These 
vectors express an activation domain fusion and a DNA binding domain fusion, separated 
by an IRES element, from the RSV promoter/enhancer. The activation domain fusion 
contains a c-myc nuclear localization signal (N2) fused to the T2098L mutant FRB domain 
(RH1) fused to either the S3H or S activation domains. The DNA binding domain fusion consists of 
an HA epitope tag (E), a nuclear localization signal from SV40 (N), the ZFHD1 DNA binding 
domain and 3 FKBP domains. As seen in Figure 8, expression of SEAP was measured as a 
function of increasing doses of rapamycin. Since the RSV enhancer works relatively poorly 
in HT1080 cells, levels of the transcription factor fusions are relatively low (i.e., compared to 
their levels when expressed from the CMV enhancer/promoter). When expressed at low 
levels, the S activation domain is not potent enough to support rapamycin-induced target 
gene expression. However, equivalent expression of the S3H activation domain fusion 
gives rise to high-level activated transcription. 
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